
SOLVED PFSense dropping internet


Stratus_ss

Overclockix Snake Charming Senior, Alt OS Content
Joined
Jan 24, 2006
Location
South Dakota
NOTE: I posted this question in the PFSense forums but we have some smart people here. Also, OCForums seems to have a faster response time.

Problem: I have a cable internet service that has been having issues. During the last 5-7 weeks the internet has been dropping at random times, and the cable company has acknowledged the issue. However, they are sure they have fixed it. I still see internet outages with some frequency (due to packet loss).

Background info: PFSense is running directly behind the cable modem. I have set up smokeping on 2 computers behind pfsense to monitor a half dozen sites, both via DNS and actual IPs. They report packet loss which agrees with PFSense's packet loss RRD. I have been running PFSense with the same setup for the last 2 years (updating as required), so I am now on PFSense 2.1. I have taken the extraordinary step of installing a second PFSense instance and turning off the first, thinking that maybe upgrading between versions left some cruft that was exposed by my ISP's issues.

PFSense problems: I believe that at this point it is actually PFSense's fault. The modem's uptime has been good, the signal-to-noise ratio has normalized, etc.

Troubleshooting done: I put a switch between the modem and the PFSense box, since my internet provider allows me to pull up to 3 IPs off my modem. One port of the switch goes to PFSense and then to the rest of the house. The other port goes into a spare machine which I have wiped, hardened and installed smokeping on. Over the last several days I have noted 100% packet loss 2-3 times a day for PFSense and machines behind PFSense. On the hardened machine on the switch, no such service interruptions have been seen. This is why I believe it to be a PFSense issue.
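The comparison described above (full loss behind PFSense, no loss on the control machine) can be sketched in Python. This is only a minimal illustration: it assumes the loss figures from each monitor have already been exported as a mapping of timestamp to loss percentage, and the function names are hypothetical, not part of smokeping or PFSense.

```python
from typing import Dict, List, Tuple


def full_loss_windows(samples: Dict[int, float],
                      threshold: float = 100.0) -> List[Tuple[int, int]]:
    """Return (start, end) timestamps for runs of adjacent samples at or above threshold."""
    windows = []
    start = prev = None
    for ts in sorted(samples):
        if samples[ts] >= threshold:
            if start is None:
                start = ts
            prev = ts
        elif start is not None:
            windows.append((start, prev))
            start = None
    if start is not None:
        windows.append((start, prev))
    return windows


def only_behind_firewall(firewalled: Dict[int, float],
                         control: Dict[int, float],
                         threshold: float = 100.0) -> List[Tuple[int, int]]:
    """Outage windows seen behind the firewall that the control host did not see."""
    result = []
    for start, end in full_loss_windows(firewalled, threshold):
        control_lossy = any(
            loss >= threshold
            for ts, loss in control.items()
            if start <= ts <= end
        )
        if not control_lossy:
            result.append((start, end))
    return result
```

Windows returned by `only_behind_firewall` are the ones that point at the firewall rather than the line, since the control host on the same switch stayed up during them.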

I have not changed anything on the PFSense box since its initial setup, so I am not really sure where the problem is. I have changed the machines (and thus the NICs), and I have changed the ethernet cables as well. I have tried virtualizing PFSense and importing the configs, with the same result.

I am not doing anything overly complex with pfsense. I have a half dozen forwarding rules, I am running OpenVPN, and I have only 3 packages installed: OpenVPN, BandwidthD and RRD summary.

I have attempted dropping the interfaces to 100TX/Full Duplex as I found suggested in a thread here, but that has not made any appreciable change that I can find.

Observation: the problem will clear itself in 15-25 minutes or if I reboot PFSense it will fix itself when PFsense comes back online.

I would appreciate any troubleshooting hints/tips. When saying things like "check this log", I would appreciate the kindness of specifying the location of said resource.
 
Hmm. I'm running 2.1 as a full install and I just checked for any errors on mine. About 36 days now with no errors on the interface. I've had my internet go offline at night before, but when the signal comes back, it grabs the IP from the DHCP with no issue. I also have OpenVPN installed along with a few other plugins.

I would have suggested swapping the NIC, but I see you have already swapped hardware and ethernet cables.

Are you dropping packets on the WAN or LAN side? Have you run Wireshark on the LAN side to see if you are being flooded with ARP broadcasts or DHCP requests? It's possible a machine in the LAN got reconfigured and is wreaking havoc.
 
Are you dropping packets on the WAN or LAN side? Have you run Wireshark on the LAN side to see if you are being flooded with ARP broadcasts or DHCP requests? It's possible a machine in the LAN got reconfigured and is wreaking havoc.

It's dropping on the WAN_DHCP RRD. I haven't run Wireshark. I can, though I am not sure what that would accomplish, as it seems to only be on the WAN side of things.

I may try reverting to 2.0. My next thought would be to do a 2.1 install and not import my config. Maybe my config is causing problems, but since I haven't changed it I am not sure why it would be. Regardless, today's task is to run a barely configured 2.1 box for a bit and see what happens.

Is there any way to bring over the RRD graphs so as to not lose the historical data?
 
Well, I'll BE! I thought our ISP was the problem...

I'm having the exact same problem that you are, and I've been blaming our internet provider. I thought that perhaps the modem could be flaky, but it never really occurred to me that the pfsense box could be the problem.

With this starting place, I'll begin working.

Also, I have no idea how to transfer RRD graphs, sorry!
 
The modem is a Thompson DCM476


I have installed pfsense 2.1 from the memory stick image. I did not import my settings. This time all I have done is set up dynamic DNS and set the unblock-us nameservers. We'll see how it goes.
 
I haven't had any outages yet (it's only been 1 hour). I will look at the gateway logs and report back when I do have an outage.

Thanks for the tip!
 
I found something, and I wonder if it was causing the problem. When the high packet loss is registered, the DHCP gateway from the cable modem goes offline. My modem (192.168.100.1) constantly offers RFC1918 IPs (i.e. private IPs). This was causing my router (pfsense) to be spammed by a large number of

Code:
kernel: arpresolve: can't allocate llinfo for 192.168.100.1
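To get a feel for how often that message shows up, the log lines can be counted per target IP. The following is a rough Python sketch that assumes the system log has already been dumped to plain text lines; where that log lives is not stated in this thread, so the input is left as any iterable of strings, and `count_arpresolve` is a made-up helper name.

```python
import re
from collections import Counter

# Matches the kernel message quoted above and captures the target IPv4 address.
ARPRESOLVE = re.compile(
    r"arpresolve: can't allocate llinfo for (\d{1,3}(?:\.\d{1,3}){3})"
)


def count_arpresolve(lines):
    """Count arpresolve failures per target IP in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = ARPRESOLVE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Running it over the log before and after a config change gives a quick check of whether the change actually stopped the messages.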

Under

Interfaces -> WAN

There is an option which I had overlooked until now "Reject Leases From". After reading the forums I realized this option was put in specifically to deal with my problem.

I have put my modem's IP in that column and the errors for arpresolve have disappeared.

It is unknown if this will be a long-term fix or not.
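The idea behind rejecting those leases can be sketched as a simple filter: ignore any DHCP offer that comes from a listed server, and flag offers of RFC1918 addresses on a WAN that should be getting a public IP. This Python sketch is only an illustration of that logic; it is not how PFSense implements the option, and the function names are hypothetical.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip: str) -> bool:
    """True if the address falls inside one of the RFC 1918 private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918_NETWORKS)


def accept_offer(server_ip: str, reject_servers) -> bool:
    """Return False for offers sent by a server on the reject list."""
    server = ipaddress.ip_address(server_ip)
    rejected = {ipaddress.ip_address(ip) for ip in reject_servers}
    return server not in rejected
```

With the modem's address on the reject list, `accept_offer("192.168.100.1", ["192.168.100.1"])` returns `False`, so only offers from the ISP's real DHCP server would be considered.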
 
I had 1 yesterday between 22:10-22:20. I have to verify whether or not it was a problem with the line or a problem with pfsense.

It has definitely minimized the occurrences so far though. (Hopefully it's not coincidental.)
 
So I had another outage today. The down time between disconnect and reconnect has definitely decreased, so that's positive. I need to compare it to my box that is sitting exposed to the interweb to see if it was a modem issue or a pfsense issue.
 
So I have confirmed that it was a pfsense issue.

I went out and got the 2.0.3 image and put it on. I will monitor. If I have to, I am going to go all the way back to 2.0.1; otherwise I may have to try rolling my own router or going with Untangle or something similar.

I don't want to use an off-the-shelf router (D-Link etc.) because I don't think they offer the flexibility I want/need.
 
I used to have this exact issue occur occasionally with my PFsense box. I have Comcast and don't use DDNS but I am trying to remember the steps I took to resolve it XOR track down the links I used to test with.

ONE item I have located is a bug with Apinger.

I am experiencing this but it is random and typically lasts ~1-2 minutes.

These MAY not apply or help, BUT from my various digging and incomplete, barely-worth-calling-them notes (which I really need to work on so I don't have to retrace my steps from square one so often)...

If I didn't mention it, it was left blank/unset.
From the dashboard

  • >Setup
    • >Advanced
      • >Firewall/NAT Tab, what do you have checked/set?
        • I have IP DO-NOT-FRAGMENT, Static route filtering and Disable all auto-added VPN rules checked. Firewall Optimization is set to normal.
        • Bogon Networks is set to Monthly.
        • Network Address Translation is set to Enable (NAT + Proxy)
        • Enable NAT Reflection and Enable automatic outbound are both checked.
      • >Networking Tab, do you have anything checked here?
        • I unchecked everything here in my troubleshooting various things and never re-enabled any of them.
      • >Miscellaneous Tab
        • Gateway Monitoring only has State Killing on Gateway Failure checked.
Have you tried checking Disable all packet filtering on the Firewall/NAT tab to test?



  • Interfaces dropdown
    • WAN
      • I have DHCP set for IPv4 Configuration Type
      • None for IPv6
      • Speed and duplex is set to Default. Even though it is running at 1 Gbps if I try to set it to this, it fails.
      • Entire rest of section is cleared/unchecked.


Again, not sure that those matter or apply to this but just trying to remember/recreate what I did to resolve the issue.

Attached is a SS of my dashboard showing I am running latest 2.1 x64.
 
It seems that some of these options have moved. To answer some of the questions

Networking Tab, do you have anything checked here?
These options are checked
-> Disable hardware TCP segmentation offload
-> Disable hardware large receive offload

I disabled IPv6 already. There is a single option under Advanced -> Miscellaneous for states. Its description says:

By default the monitoring process will flush states for a gateway that goes down. This option overrides that behavior by not clearing states for existing connections.

It is checked.

Speed and Duplex is set to default.

Have you tried checking Disable all packet filtering on the Firewall/NAT tab to test?

I have not. I have, however, done a fresh install with a brand new config (as in whatever PFSense does out of the box) and still experienced the same issues.
 
Two more outages since I downgraded to 2.0.3. I have gone to System -> Routing -> Edit Gateway and checked "Disable Gateway Monitoring".

And the testing continues...
 
After a few days with gateway monitoring disabled, I have noticed no appreciable change (still running 2.0.3). I have re-enabled gateway monitoring and set Frequency Probe to 3 and Down to 20, as well as the low water mark for Packet Loss to 25%.
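As a rough mental model of what those three numbers do, here is a simplified Python sketch. It assumes one probe every 3 seconds, a gateway counted as down once replies have been missing for 20 seconds, and a loss alarm at 25% or more over the observed probes. This is an assumption-laden illustration, not the monitoring daemon's actual algorithm, and the function names are invented.

```python
def failures_needed(interval_s: int, down_s: int) -> int:
    """Back-to-back lost probes needed to cover down_s seconds (ceiling division)."""
    return -(-down_s // interval_s)


def evaluate(probes, interval_s=3, down_s=20, loss_low_pct=25.0):
    """Evaluate probe results (True = reply received, oldest first).

    Returns a dict with the trailing-failure 'down' verdict, the loss
    percentage over all supplied probes, and whether it crosses the low mark.
    """
    needed = failures_needed(interval_s, down_s)
    trailing = 0
    for ok in reversed(probes):
        if ok:
            break
        trailing += 1
    loss_pct = 100.0 * probes.count(False) / len(probes) if probes else 0.0
    return {
        "down": trailing >= needed,
        "loss_pct": loss_pct,
        "loss_alarm": loss_pct >= loss_low_pct,
    }
```

Under these assumptions, seven lost probes in a row are needed before the gateway counts as down, so a short burst of loss would raise the loss alarm without taking the gateway offline.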
 
This might be a silly question, but what kind of hardware are you using?
What's your connection speed, and what are you doing when you get an outage?

Just making sure your hardware is up for the task.

My Via Nano gave me a similar issue years back under certain loads when I upgraded my connection to 50Mbps in Vancouver.

Also, you have swapped out the cable from your cable modem to your PFSense box, correct?

Sometimes it's a stupid kink in the cable that can cause your cable to fail.
 
This might be a silly question, but what kind of hardware are you using?
What's your connection speed, and what are you doing when you get an outage?

Just making sure your hardware is up for the task.

It's
Code:
AMD Athlon(tm) Dual Core Processor 4850e
2 CPUs: 1 package(s) x 2 core(s)
Memory usage 	3% of 7114 MB

Sitting on top of an SSD (cuz I had a spare one)

It's simply a 14M cable connection.

Also, you have swapped out the cable from your cable modem to your PFSense box, correct?

Sometimes it's a stupid kink in the cable that can cause your cable to fail.

I have tried 3 different ethernet cables and 2 different computers, for a total of 5 NICs (one computer had 3 NICs, the other has 2).

As an update, I have registered the same errors in the modem that corresponded with outages in the last few days. However, since I have changed the polling and down integers, I haven't had a complete outage. I am hopeful that this is a good workaround.

As for activity during the outages, I believe I have previously mentioned that it is completely random. Sometimes no one is even home when the monitoring software and router report the outage. Sometimes it's Netflix, sometimes it's WoW, sometimes it's just general browsing. There doesn't seem to be any correlation to upload bandwidth, or bandwidth usage in general.
 