Monday, February 28, 2011

multicast.. forever a pain in the ass

Now this is getting a bit irritating.. stuck for a few days, but today I managed to zoom in on the culprit router. Below is the diagram.

R7, R8 and R9 are running an MPLS VPN backbone, with multicast enabled in the core as well. Multicast VPN has been configured on top.. the control plane works fine.. R5 joins 225.5.5.5, but a ping to 225.5.5.5 from R4 is not getting through.
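
For context, the mVPN part is just the usual default/data MDT configuration under the VRF on each PE. A rough sketch of what that looks like (the RD/RT values, the data-MDT threshold and the loopback number are placeholders; only the group addresses match what shows up later):

ip vrf ABC2
 rd 65000:2
 route-target both 65000:2
 ! default MDT group, matches the 239.0.0.1 seen in the sniffer trace
 mdt default 239.0.0.1
 ! data MDT pool, matches the MDT:239.0.8.0 entry in the mroute table below
 mdt data 239.0.8.0 0.0.0.255 threshold 1
!
! multicast routing in the global table and inside the VRF
ip multicast-routing
ip multicast-routing vrf ABC2
!
! PIM on the loopback that sources the MDT tunnel
interface Loopback0
 ip pim sparse-mode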

Sniffing shows R7 sending the traffic for 225.5.5.5 inside the multicast tunnel group 239.0.0.1, and the capture shows it reaching R9. On R9, the multicast route table for the VRF looks like this:

R9#
00:14:19: %SYS-5-CONFIG_I: Configured from console by console
R9# sh ip mroute vrf ABC2
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 225.5.5.5), 00:23:21/00:02:48, RP 172.16.0.7, flags: S
  Incoming interface: Tunnel0, RPF nbr 12.12.0.7
  Outgoing interface list:
    FastEthernet0/0.59, Forward/Sparse, 00:23:21/00:02:48

(172.16.48.4, 225.5.5.5), 00:01:57/00:01:03, flags: TY
  Incoming interface: Tunnel0, RPF nbr 12.12.0.8, MDT:239.0.8.0/00:02:03
  Outgoing interface list:
    FastEthernet0/0.59, Forward/Sparse, 00:01:57/00:02:48

(*, 224.0.1.40), 00:23:27/00:02:56, RP 0.0.0.0, flags: DCL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Loopback1, Forward/Sparse, 00:23:27/00:02:56
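
The Y flag plus the MDT:239.0.8.0 tag on the (S,G) entry mean the source PE has pushed this flow onto a data MDT. On code that supports them, this can be cross-checked with something like (command availability varies by release):

R9# show ip pim vrf ABC2 mdt receive detail
R7# show ip pim vrf ABC2 mdt send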

OK, so it has already switched over to the MDT data group, since R7 is happily sending the traffic.. fine. The problem is that R9 is not forwarding anything out:

R9# sh ip mroute vrf ABC2 cou
IP Multicast Statistics
3 routes using 1720 bytes of memory
2 groups, 0.50 average sources per group
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)

Group: 225.5.5.5, Source count: 1, Packets forwarded: 2, Packets received: 2
  RP-tree: Forwarding: 2/0/100/0, Other: 2/0/0
  Source: 172.16.48.4/32, Forwarding: 0/0/0/0, Other: 0/0/0

Group: 224.0.1.40, Source count: 0, Packets forwarded: 0, Packets received: 0
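
(A side note for anyone repeating this at home: debug ip mpacket normally only shows process-switched packets, so fast switching may need to be turned off on the interfaces involved first, e.g. with my interface names:

interface FastEthernet0/0.79
 no ip mroute-cache
interface FastEthernet0/0.59
 no ip mroute-cache

Don't leave that on a busy box.)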

Debug ip mpacket detail tells why:

R9# deb ip mpa de
IP multicast packets debugging is on (detailed)
R9#
00:33:41: IP(0): MAC sa=*Tunnel* (Tunnel0)
00:33:41: IP(0): IP tos=0x0, len=100, id=1280, ttl=252, prot=1
00:33:41: IP(0): s=172.16.48.4 (Tunnel0) d=225.5.5.5 (FastEthernet0/0.69) id=1280, ttl=252, prot=1, len=100(100), not RPF interface
R9#
00:33:43: IP(0): MAC sa=ca08.1a44.0008 (FastEthernet0/0.79), IP last-hop=12.12.79.7
00:33:43: IP(0): IP tos=0x0, len=124, id=14896, ttl=254, prot=47
00:33:43: IP(0): MAC sa=ca08.1a44.0008 (FastEthernet0/0.79), IP last-hop=12.12.79.7
00:33:43: IP(0): IP tos=0x0, len=124, id=14897, ttl=254, prot=47
00:33:43: IP(0): MAC sa=*Tunnel* (Tunnel0)
00:33:43: IP(0): IP tos=0x0, len=100, id=1280, ttl=253, prot=1
00:33:43: IP(0): s=172.16.47.4 (Tunnel0) d=225.5.5.5 (FastEthernet0/0.69) id=1280, ttl=253, prot=1, len=100(100), not RPF interface
00:33:43: IP(0): MAC sa=*Tunnel* (Tunnel0)
00:33:43: IP(0): IP tos=0x0, len=100, id=1280, ttl=252, prot=1
00:33:43: IP(0): s=172.16.0.4 (Tunnel0) d=225.5.5.5 (FastEthernet0/0.69) id=1280, ttl=252, prot=1, len=100(100), not RPF interface
00:33:43: IP(0): MAC sa=ca08.1a44.0008 (FastEthernet0/0.79), IP last-hop=12.12.79.7
00:33:43: IP(0): IP tos=0x0, len=124, id=14898, ttl=254, prot=47
00:33:43: IP(0): MAC sa=*Tunnel* (Tunnel0)

This is where things stop making sense. The OIL lists fa0/0.59 as the egress interface for group 225.5.5.5, yet the debug shows R9 referring to fa0/0.69 and dropping the packets with a "not RPF interface" error?!
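
Next thing to pin down is what RPF actually resolves to for that source inside the VRF, compared with the unicast/CEF path, along the lines of (source address taken from the outputs above):

R9# show ip rpf vrf ABC2 172.16.48.4
R9# show ip cef vrf ABC2 172.16.48.4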

-----------------------------------------------------------
Actions tried:
-- added 'bgp next-hop loopback', since R9 is peering eBGP over fa0/0.69; no good (a sketch of what I typed is after this list).
-- shut down Loopback2; no good.
-- shut down fa0/0.69, and everything works. Well, that is not a solution.
------------------------------------------------------------------
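
For the record, the first item above was roughly this VRF-level knob (written from memory, so double-check the exact syntax on your code; the loopback number is whichever loopback the VPNv4 session on R9 should be sourced from):

ip vrf ABC2
 ! use this loopback as the BGP next hop for the mVPN RPF check
 bgp next-hop Loopback 0
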
thoughts:
-- R9 has two loopback addresses in the global table.. I seem to remember that can cause some sort of problem when the backbone multicast is not using the MDT address family, even though in my case I am using PIM BSR instead of PIM SSM to build the backbone. Research this more tomorrow.. in any case my code does not support the MDT address family. XD
-- there is an L2TP tunnel forming through fa0/0.69; related?
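
Things to line up tomorrow (the idea being to see which of the two loopbacks the VPNv4 next hop for the customer source resolves to, versus which address the PIM neighbour on the MDT tunnel is coming from):

R9# show ip bgp vpnv4 vrf ABC2 172.16.48.4
R9# show ip pim vrf ABC2 neighbor
R9# show ip rpf 12.12.0.8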




Wednesday, February 16, 2011

No studying today

Ah, I wasted today..