Introduction
Two key functions
- forwarding: move packets from router’s input to appropriate router output
- routing: determine route taken by packets from source to destination.
Forwarding Table
header value | Output Link |
---|---|
0100 | 3 |
0101 | 2 |
0111 | 2 |
1001 | 1 |
Network Service Model
Example services for individual datagrams
- guaranteed delivery
- guaranteed delivery with less than 40 msec delay
Example services for a flow of datagrams
- in-order datagram delivery
- guaranteed minimum bandwidth to flow
- restrictions on changes in inter-packet spacing
Virtual Circuit and Datagram Networks
Connection, Connect-less Service
- Datagram network provides network-layer connectionless service.
- Virtual-circuit network provides network-layer connection service.
Analogous to TCP/UDP connecton-oriented / connectionless transport-layer services, but: - service: host-to-host - no choice: network provides one or the other - implementation: in network core
Virtual Circuits
source-to-dest path behaves much like telephone circuit
A VC consists of
- path from source to destination
- VC numbers, one number for each link along path
- entries in forwarding tables in routers along path
packet belonging to VC carries VC number (rather than dest address)
VC number can be changed on each link. - new VC number comes from forwarding table
Signaling protocols
- used to serup maintian and teardown VC
- used in ATM, frame-relay
- not used in today's Internet
Datagram Networks
- no call setup at network layer
- routers: no state about end-to-end connections
- no network-level concept of “connection”
- packets forwarded using destination host address
Datagram forwarding table
- Longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.
Router
There are two key router functions: - run routing algorithms/protocol (RIP, OSPF, BGP) - forwarding datagrams from incoming to outgoing link
Input port functions
Switching fabrics
Switching fabrics transfer packet from input buffer to appropriate output buffer.
Switching rate: rate at which packets can be transfer from inputs to outputs - often measured as multiple of input/output line rate - N inputs: switching rate N times line rate desirable
- Switching via memory: speed limited by memory bandwidth (2 bus crossings per datagram)
- bus contention: switching speed limited by bus bandwidth
- Switching via interconnection network: overcome bus bandwidth limitations
Output ports
- Buffering required when datagrams arrive from fabric faster than the transmission rate
- Scheduling discipline chooses among queued datagrams for transmission
Output port queueing
Fabric slower than input ports combined -> queueing may occur at input queues - queueing (delay) and loss due to output port buffer overflow!
Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
IP: Internet Protocol
IP datagram format
IP fragmentation
Maximum Transmission Unit (MTU): largest possible link-level frame.
- different link types have different MTUs
large IP datagram divided (“fragmented”) within net
- one datagram becomes several datagrams
- “reassembled” only at final destination
- IP header bits used to identify, order related fragments
e.g. We have a 4000 byte datagram with MTU = 1500 bytes
Fragments | Length | ID | Offset | Fragflat |
---|---|---|---|---|
1 | 1500 | x | 0 | 1 |
2 | 1500 | x | 185 | 1 |
3 | 1020 (=3980-1480-1480) | x | 370 | 0 |
IP addressing
Classless InterDomain Routing (CIDR) - subnet portion of address of arbitrary length - address format: a.b.c.d/x, where x is # bits in subnet portion of address
Subnets
- What's a subnet?
- device interfaces with same subnet part of IP address
- can physically reach each other without intervening router
How does a host get IP address?
- Dynamic Host Configuration Protocol (DHCP): dynamically get address from as server
- DHCP can return more than just allocated IP address on subnet:
- address of first-hop router for client
- name and IP address of DNS sever
- network mask (indicating network versus host portion of address)
NAT: network address translation
motivation: local network uses just one IP address as far as outside world is concerned - range of addresses not needed from ISP - can change addresses of devices in local network without notifying outside world - can change ISP without changing addresses of devices in local network
ICMP: Internet Control Message Protocol
ICMP is used by hosts and routers to communicate network level information. - error reporting - echo request
ICMP message: type, code plus first 8 bytes of IP datagram causing error.
ICMP Type | Code | Description |
---|---|---|
0 | 0 | Response |
3 | 1 | Dest host not reachable |
... | ... | .... |
IPv6
Motivation
- initial motivation: 32-bit address space soon to be completely allocated.
- additional motivation:
- header format helps speed processing/forwarding
- header changes to facilitate QoS
Format
- fixed-length 40 byte header
- no fragmentation allowed
- priority: identify priority among datagrams in flow
- flow Label: identify datagrams in same “flow.”
- checksum: removed entirely to reduce processing time at each hop
- options: allowed, but outside of header, indicated by “Next Header” field
- ICMPv6: new version of ICMP
Tunneling
Tunneling: IPv6 datagram carried as payload in IPv4 datagram among IPv4 routers
Routing ALgorithm
Routing algorithm classification
global vs decentralized
- global routing algorithm
- all routers have complete topology, link cost info
- “link state” algorithms
- decentralized routing algorithm
- router knows physically-connected neighbors, link costs to neighbors
- “distance vector” algorithms
static or dynamic
- statistic: routes change slowly over time
- dynamic: routes change more quickly
Link-State Routing Algorithm
Dijkstra’s algorithm
- Algorithm complexity:
- n(n+1)/2 comparison: \(O(n^2)\)
Step | N' | v | w | x | y | z |
---|---|---|---|---|---|---|
0 | u | 2, u | 5, u | 1, u | ++ | ++ |
1 | ux | 2, u | 4, x | 2, x | ++ | |
2 | uxy | 2, u | 3, y | 4, y | ||
3 | uxyv | 3, y | 4, y | |||
4 | uxyvw | 4, y | ||||
5 | uxyvwz |
Distance vector
\(d_x(y)=\min\{c(x,v)+d_v(y)\}\)
link cost changes: - node detects local link cost change - bad news travels slow - “count to infinity” problem! - 44 iterations before algorithm stabilizes
Hierarchical Routing
- We have scale with 600 million destinations, and we cannot store all dest's in routing tables.
- administrative autonomy: each network admin may want to control routing in its own network
Autonomous systems (AS): aggregate routers into regions - intra-AS routing protocol is routers in same AS run same routing protocol. - routers in different AS can run different intra-AS routing protocol
** Gateway router** has link to router in another AS.
Routing in the Internet
RIP (Routing Information Protocol)
It uses distance vector algorithm - distance metric: # hops (max = 15 hops), each link has cost 1 - DVs exchanged with neighbors every 30 sec in response message (aka advertisement) - each advertisement: list of up to 25 destination subnets (in IP addressing sense)
RIP routing tables managed by application-level process called route-d (daemon) - sent in UDP packets, periodically repeated.
OSPF (Open Shortest Path First)
It uses link state algorithm - LS packet dissemination - topology map at each node - route computation using Dijkstra’s algorithm
advertisements flooded to entire AS - carried in OSPF messages directly over IP (rather than TCP or UDP
IS-IS routing protocol: nearly identical to OSPF
BGP (Border Gateway Protocol)
BGP provides each AS a means to: - eBGP: obtain subnet reachability information from neighboring ASs. - iBGP: propagate reachability information to all AS-internal routers.
to be completed...
Broadcast and Multicast Routing
We want to deliver packcet from source to all other nodes, and source duplication is ineficient.
Broadcasting Routing
- Flooding: when node receives broadcast packet, sends copy to all neighbors
- problems: cycles & broadcast storm
- Controlled flooding: node only broadcasts pkt if it hasn’t broadcast same packet before
- node keeps track of packet ids already broadcasted
- or reverse path forwarding (RPF): only forward packet if it arrived on shortest path between node and source
- Spanning tree: no redundant packets received by any node
- first construct a spanning tree
- nodes then forward/make copies only along spanning tree
Multicast Routing
- goal: find a tree (or trees) connecting routers having local multicast group members
- tree: not all paths between routers used
- shared-tree: same tree used by all group members
source-based: different tree from each sender to rcvrs
- approaches:
- source-based tree: one tree per source
- shortest path trees
- reverse path forwarding
- group-shared tree: group uses one tree
- minimal spanning (Steiner)
- center-based trees
Choose multicasting routing
- Distance vector multicast routing protocol (DVMRP)
- RPF tree based on DVMRP’s own routing tables constructed by communicating DVMRP routers
- Protocol independent multicast (PIM): not dependent on any specific underlying unicast routing algorithm (works with all)
- dense: group members densely packed, in “close” proximity; bandwidth more plentiful
- sparse: # networks with group members small wrt # interconnected networks; group members “widely dispersed”; bandwidth not plentiful