[pox-dev] how to make recoco support more than 1024 fds?

Nan Zhu zhunanmcgill at gmail.com
Mon Dec 2 20:53:13 PST 2013


Hi, Murphy,   

See my inline answers below.


--  
Nan Zhu
School of Computer Science,
McGill University


On Monday, December 2, 2013 at 11:03 PM, Murphy McCauley wrote:  
> On Dec 2, 2013, at 6:40 PM, Nan Zhu <zhunanmcgill at gmail.com> wrote:
>  
> > Hi, Murphy,
> >  
> > the epoll-enabled pox is working properly
>  
> Cool. Thanks for letting me know it still works. It's due to get refactored a bit at some point (this has been the longest-open github issue for POX), after which it should probably get a commandline switch (I've added a note to this effect on the issue).
>  
> > The only issue is that POX is still sensitive to the connection rate (as I mentioned in the other thread).
> >  
> > I started 2880 switches and kept getting ConnectException (sometimes connection reset by peer) until I slowed down to only one connection request per second.
>  
> 2,880 simultaneous connections hasn't been a use-case we've seen a lot of... so the code certainly hasn't been tuned to make this work well. In general, the switches should retry every few seconds, so it's possible the few people who work with large numbers of switches (including myself on very rare occasions) have just relied on this.

All of these switches are implemented in software, so yes, I will try adding some retry code to my implementation.
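
A minimal sketch of what such retry code could look like on the switch side, assuming a plain TCP connection to the controller. The function name `connect_with_retry` and all of its parameters (attempt count, delays, timeout) are hypothetical and not taken from POX or from my switch implementation:

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, initial_delay=0.5, max_delay=8.0):
    """Open a TCP connection, retrying with exponential backoff on failure.

    All parameter names and defaults here are illustrative assumptions.
    """
    delay = initial_delay
    for attempt in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=5.0)
        except OSError:
            # OSError covers refused, reset, and timed-out connection attempts.
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            time.sleep(delay)
            delay = min(delay * 2, max_delay)
```

Doubling the delay after each failure keeps a large group of switches from hammering the controller again at the same instant.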
  
>  
> Here are some questions:
>  
> * Are you saying you get different types of errors? Can you send snippets from the POX log?

I’m running some other experiments for a submission; once those are finished, I will reproduce this and send you the log.
  
>  
> * Which side do you get the connection reset by peer on (the POX side or the client/switch side)?

switch side
  
>  
> * Do you generally get roughly some number of connections just fine before you start having problems?

Yes. If the number of concurrent connections is around 720, I can connect at a rate of 10 per second.
  
>  
> * Do you really need to slow it down to one per second? If you double the rate (one per half second) or double the number of connections (two at a time, once per second), you have problems?

Yes. In my testbed, my software switch throws ConnectException after around 300 connections have been established.
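
For comparing different connection rates in a repeatable way, something like the following pacing helper could be used on the test driver side. This is only a sketch: `paced` is a hypothetical helper, not part of POX or of my testbed code.

```python
import time


def paced(items, per_second):
    """Yield items so that consecutive yields are at least 1/per_second apart."""
    interval = 1.0 / per_second
    earliest = time.monotonic()  # earliest time the next item may be released
    for item in items:
        wait = earliest - time.monotonic()
        if wait > 0:
            time.sleep(wait)
        earliest = time.monotonic() + interval
        yield item
```

For example, `for switch_id in paced(range(2880), per_second=10): ...` would start 2880 switches over roughly 288 seconds, and changing `per_second` to 1 or 2 reproduces the slower rates discussed above.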
  
>  
>  
> And here are a couple things to try (alone and together):
>  
> * Run POX (dart) with the experimental --unthreaded-sh option (at the beginning of the commandline). Any difference/improvement?
>  
> * Increase the socket backlog in of_01.py's call to listen() somewhere around line 874. On modern Linux, this can probably be up to 128 by default, and higher with some tweaking. Any difference/improvement?

I will try it
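
For reference, the backlog is the argument passed to the standard socket `listen()` call. The sketch below only illustrates that call; it is not the actual code in of_01.py, and `make_listener` and its defaults are hypothetical:

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=128):
    """Create a listening TCP socket with an explicit accept backlog.

    port=0 asks the OS for any free port; a controller would use a fixed one.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    # The backlog bounds how many connections may wait to be accepted.
    listener.listen(backlog)
    return listener
```

On Linux the kernel silently caps the requested backlog at the `net.core.somaxconn` sysctl value, so going beyond that limit requires raising the sysctl as well as the argument to `listen()`.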
  
>  
>  
> -- Murphy
>  
> > I will look into this issue using the POX software switch.
> >  
> > Best,
> >  
> > --  
> > Nan Zhu
> > School of Computer Science,
> > McGill University
> >  
> >  
> > On Monday, December 2, 2013 at 5:18 PM, Murphy McCauley wrote:
> >  
> > > http://ucb-sts.github.io/sts/
> > >  
> > > On Dec 2, 2013, at 2:28 PM, Nan Zhu <zhunanmcgill at gmail.com> wrote:
> > >  
> > > > BTW, what do you mean by STS?
> > > >  
> > > > --
> > > > Nan Zhu
> > > > School of Computer Science,
> > > > McGill University
> > > >  
> > > > On Monday, December 2, 2013 at 5:27 PM, Nan Zhu wrote:
> > > > > Yes, just found that with grep
> > > > >  
> > > > > I’m testing it
> > > > >  
> > > > > Thank you so much
> > > > >  
> > > > >  
> > > > > --
> > > > > Nan Zhu
> > > > > School of Computer Science,
> > > > > McGill University
> > > > >  
> > > > >  
> > > > > On Monday, December 2, 2013 at 5:12 PM, Murphy McCauley wrote:
> > > > >  
> > > > > > STS  
