sharing port or socket #index page

Opening example – we all know that once a local endpoint is occupied by a tcp server process, another process can’t bind to it.

However, various scenarios exist to allow some form of sharing.

https://bintanvictor.wordpress.com/2017/04/29/socket-shared-between-2-processes/

https://bintanvictor.wordpress.com/2017/04/29/so_reuseport-socket-option/

https://bintanvictor.wordpress.com/2017/04/29/2-tcp-clients-connected-to-a-single-server-socket/

https://bintanvictor.wordpress.com/2017/03/29/multiple-sockets-can-attach-to-the-same-addressport/

## IV favorites ] sockets

There are dozens of sub-topics but in my small sample of interviews, the following sub-topics have received disproportionate attention:

  1. blocking vs non-blocking
  2. select()
  3. multicast
  4. accept() + threading
  5. add basic reliability over UDP (many blog posts); how is TCP transmission control implemented

Background — The QQ/ZZ framework was first introduced in this post on c++ learning topics

Only c/c++ positions need socket knowledge. However, my perl/py/java experience with socket API is still relevant.

Socket is a low-level subject. Socket tough topics feel not as complex as concurrency, algorithms, probabilities, OO design, MOM … Interview is mostly knowledge test; but to do well in real projects, you probably need experience.

Coding practice? no need. Just read and blog.

Socket knowledge is seldom the #1 selection criteria for a given position, but could be #3. (In contrast, concurrency or algorithm skill could be #1.)

  • [ZZ] tweaking
  • [ZZ] exception handling in practice
  • —-Above topics are still worth studying to some extent—–
  • [QQ] tuning
  • [QQ] buffer management

 

socket accept() key points often missed

I have studied accept() many times but still unfamiliar.

Useful as zbs, and perhaps QQ, rarely for GTD…

Based on P95-97 [[tcp/ip socket in C]]

  • (probably) used in tcp only
  • (probably) used on server side only
  • usually called inside an endless loop
  • blocks most of the time, when there’s no incoming new connections. The existing clients don’t bother us as they communicate with the “child” sockets independently. The accept() “show” starts only upon a new incoming connection
    • thread remains blocked, starting from receiving the incoming until a newborn socket is fully Established.
    • at that juncture the new remote client is probably connected to the newborn socket, so the “parent thread[2]” have the opportunity/license to let-go and return from accept()
    • now, parent thread has the newborn socket, it needs to pass it to a child thread/process
    • after that, parent thread can go back into another blocking accept()
  • new born or other child sockets all share the same local port, not some random high port! Until now I still find this unbelievable. https://stackoverflow.com/questions/489036/how-does-the-socket-api-accept-function-work confirms it.
  • On a host with a single IP, 2 sister sockets would share the same local ip too, but luckily each socket structure has at least 4 [1] identifier keys — local ip:port / remote ip:port. So our 2 sister sockets are never identical twins.
  • [1] I omitted a 5th key — protocol as it’s a distraction from the key point.
  • [2] 2 variations — parent Thread or parent Process.

good article: epoll() illustrated n contrasted^q(SELECT)

Compared to select(), the newer linux system call epoll() is designed to be more performant.

Ticker Plant uses epoll. No select() at all.

https://banu.com/blog/2/how-to-use-epoll-a-complete-example-in-c/ is a nice article with sample code of a TCP server.

  • main() function with an event loop
  • bind()
  • listen()
  • accept()

I think this toy program is more educational than a real-world epoll server with thousands of lines of code.

No sample client but I think a standard TCP client will do.

TCP client set-up steps #connect()UDP

TCP Client-side is a 2-stepper (look at Wikipedia and [[python ref]], among many references)
1) [SC] socket()
2) [C] connect()

[SC = used on server and client sides]
[S=server-only]
[C=client-only. seldom/never used on server-side.]

Note UDP is connection-less but connect() can be used too — to set the default destination. See https://stackoverflow.com/questions/9741392/can-you-bind-and-connect-both-ends-of-a-udp-connection.

Under TCP, The verb connect() means something quite different — “reach across and build connection”[1]. You see it when you telnet … Also, server-side don’t make outgoing connections, so this is used by TCP client only. When making connection, we often see error messages about server refusing connection, because no server is “accepting”.

[1] think of a foreign businessman traveling to China to build guanxi with local government officials.

 

multicast address ownership#eg exchanges

https://www.iana.org/assignments/multicast-addresses/multicast-addresses.xhtml shows a few hundred big companies including exchanges. For example, one exchange multicast address 224.0.59.76 falls within the range 224.0.58.0 to 224.0.61.255
Intercontinental Exchange, Inc.
It’s educational to compare with a unicast IP address. If you own such an unicast address, you can put it on a host and bind an http server to it. No one else can bind a server to that uncast address. Any client connecting to that IP will hit your host.

As owner of a multicast address, you alone can send datagrams to it and (presumably) you can restrict who can send or receive on this group address. Alan Shi pointed out the model is pub-sub MOM.

UDP^TCP again#retrans

http://www.diffen.com/difference/TCP_vs_UDP is relevant.

FIFO — TCP; UDP — packet sequencing is uncontrolled
Virtual circuit — TCP; UDP — datagram network
Connectionless — UDP ; TCP — Connection-oriented

With http, ftp etc, you establish a Connection (like a session). No such connection for UDP communication.

Retransmission is part of — TCP; UDP — application layer (not network layer) on receiving end must request retransmission.

To provide guaranteed FIFO data delivery, over unreliable channel, TCP must be able to detect and request retransmission. UDP doesn’t bother. An application built on UDP need to create that functionality, as in the IDC (Interactive Data Corp) ticker plant. Here’s one simple scenario (easy to set up as a test):

  • sender keeps multicasting
  • shut down and restart receiver.
  • receiver detects the sequence number gap, indicate message loss during the down time.
  • Receiver request for retransmission.