Skip to content

Conversation

@SudharmaJain
Copy link
Contributor

…P-START/RECREATE

The ongoing ICMP request reply session is broken when the VR is down, the expectation is that it would resume once the VR is up. Investigations revealed that the ongoing ICMP packets are sent out of eth2 without being NATed post VR stop/start or restart or recreate.

TCPDUMP output from VR post restart/stop-start/recreate on eth2:

root@r-4-VM:# tcpdump -i eth2 icmp -n -vvv
tcpdump: listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
06:22:52.749770 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 81, length 64
06:22:53.749782 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 82, length 64
06:22:54.749771 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 83, length 64
06:22:55.749775 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 84, length 64
06:22:56.749765 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 85, length 64
06:22:57.749776 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 86, length 64
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel
root@r-4-VM:
#
root@r-4-VM:~# grep icmp /proc/net/ip_conntrack
icmp 1 29 src=192.168.200.67 dst=173.194.33.163 type=8 code=0 id=30996 [UNREPLIED] src=173.194.33.163 dst=192.168.200.67 type=0 code=0 id=30996 mark=0 use=2

This get fixed after flushing the conntrack table.

Screenshots:

Before fix (ping session doesn't resume, stop and starting the ping works, 120 packets lost):
image

After fix(ping session resumes, 27 packets lost):
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of ps | grep, why not use status result and start, like: service conntrackd status || service conntrackd start (or other commands) etc.

@asfbot
Copy link

asfbot commented Sep 16, 2015

cloudstack-pull-rats #622 SUCCESS
This pull request looks good

@asfbot
Copy link

asfbot commented Sep 16, 2015

cloudstack-pull-analysis #561 SUCCESS
This pull request looks good

@asfbot
Copy link

asfbot commented Sep 16, 2015

cloudstack-pull-rats #630 SUCCESS
This pull request looks good

@SudharmaJain
Copy link
Contributor Author

@bhaisaab , using service conntrackd status don't provide the status and always gives the usage information for conntrackd. So i have added the changes by starting conntrackd in deamon mode with (-d) switch.

@asfbot
Copy link

asfbot commented Sep 16, 2015

cloudstack-pull-analysis #569 UNSTABLE
Looks like there's a problem with this pull request

@asfbot
Copy link

asfbot commented Sep 17, 2015

cloudstack-pull-rats #639 SUCCESS
This pull request looks good

@asfbot
Copy link

asfbot commented Sep 17, 2015

cloudstack-pull-analysis #578 SUCCESS
This pull request looks good

@asfbot
Copy link

asfbot commented Sep 17, 2015

cloudstack-pull-analysis #590 SUCCESS
This pull request looks good

@wilderrodrigues
Copy link
Contributor

@jayapalu @karuturi

We, @remibergsma, Funs and I will test the 3 PRs which are VR related now.

remibergsma added a commit to remibergsma/cloudstack that referenced this pull request Sep 24, 2015
CLOUDSTACK-8863: VM doesn't reconnect to internet post VR RESTART/STOP-START/RECREATE

The ongoing ICMP request reply session is broken when the VR is down, the expectation is that it would resume once the VR is up. Investigations revealed that the ongoing ICMP packets are sent out of eth2 without being NATed post VR stop/start or restart or recreate.

TCPDUMP output from VR post restart/stop-start/recreate on eth2:

root@r-4-VM:~# tcpdump -i eth2 icmp -n -vvv
tcpdump: listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
06:22:52.749770 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 81, length 64
06:22:53.749782 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 82, length 64
06:22:54.749771 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 83, length 64
06:22:55.749775 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 84, length 64
06:22:56.749765 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 85, length 64
06:22:57.749776 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 86, length 64
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel
root@r-4-VM:~#
root@r-4-VM:~# grep icmp /proc/net/ip_conntrack
icmp     1 29 src=192.168.200.67 dst=173.194.33.163 type=8 code=0 id=30996 [UNREPLIED] src=173.194.33.163 dst=192.168.200.67 type=0 code=0 id=30996 mark=0 use=2

This get fixed after flushing the conntrack table.

Screenshots:

Before fix (ping session doesn't resume, stop and starting the ping works, 120 packets lost):
![image](https://cloud.githubusercontent.com/assets/12229259/9897800/4de7488e-5c6a-11e5-98eb-3bd79cc3a8b1.png)

After fix(ping session resumes, 27 packets lost):
![image](https://cloud.githubusercontent.com/assets/12229259/9897822/9112e866-5c6a-11e5-95b3-1b20600d2e44.png)

* pr/836:
  CLOUDSTACK-8863: VM doesn't reconnect to internet post VR RESTART/STOP-START/RECREATE

Signed-off-by: Remi Bergsma <[email protected]>
@asfgit asfgit merged commit 56d4429 into apache:master Sep 27, 2015
asfgit pushed a commit that referenced this pull request Sep 27, 2015
[BLOCKER] Combined PRs that fix VR issuesTonight I worked with @wilderrodrigues to figure out what is wrong with the virtual router. As we couldn't test single PRs any more (because of other issues with them causing tests to fail) we added all VR related PRs in a separate branch and started testing from there.

We combined the following PRs into this PR:
#836 #851 #867 #870 #881 #882 #842

After that, one issue remains: the VPC does not get a default gateway. Which is strange, because we already solved it in PR #738. When I look back, it was fixed again in PR #784. It could very well be that either one fixed one specific case, but also breaking the other. We need to investigate this, and make sure there will be a fix that works both for VPCs and VRs.

When we manually add the default gateway on the VPC, most tests pass and also spinning up two VPCs with one tier each, having a VM and them using s2s to VPN them together works fine. See for more details the report Wilder sent earlier.

Tomorrow we'll try to figure out how to fix the default gateway and merge this. Then we should have a base to work from again. Any PR that fixes another blocker, should at least then be rebased against the fixed master so we can run the tests against the PR branch. I'm not saying everything is fixed, I'm just saying that we can spin up a cloud that has working VMs.

When, in the mean time, someone has the time to checkout this branch and make the default route work for both VPC and VR that would be awesome. After that we should double check and verify the test results.

Pinging @karuturi to let her know the current status.

Regards,
Wilder / Remi

* pr/887:
  Fixing the index out of bounds error in the check_if_link_up() function
  small cleanups
  Fixing the defaut route for VPC routers
  Formatting the get_gateway() method in the CsDatabag.py file
  Fixing the dhcpsrvr iptables file
  Formatting the router_proxy.sh script
  CLOUDSTACK-8881: Fixed Static and PF configuration issue
  CLOUDSTACK-8905: Fixed hooking egress rules
  CLOUDSTACK-8891: Fixed default iptables rules on VR  for guest traffic
  Configured dnsmasq to listen on all interfaces so that vpn  client gets dns
  CLOUDSTACK-8864: Not able to add TCP port forwarding rule in VPN for specific ports
  CLOUDSTACK-8863: VM doesn't reconnect to internet post VR RESTART/STOP-START/RECREATE
  CLOUDSTACK-8843: Fixed issue in default iptables rules on shared network VR

Signed-off-by: Remi Bergsma <[email protected]>
@SudharmaJain SudharmaJain deleted the cs-8863 branch October 12, 2015 04:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants