NAT (Network Address Translation) maps multiple private, local addresses to a single public address before traffic leaves the network. Organizations that want many devices to share a single public IP address use NAT, as do most home routers.
To traverse NATs, a protocol needs two things. First, the protocol should be based on UDP. You can do NAT traversal with TCP, but it adds another layer of complexity to an already quite complex problem. Second, you need direct control over the network socket that’s sending and receiving network packets. Direct socket access may be hard to get depending on your situation. One workaround is to run a local proxy: your protocol speaks to this proxy, and the proxy does both the NAT traversal and the relaying of your packets to the peer.
There are two obstacles to having NAT traversal Just Work: stateful firewalls and NAT devices.
Stateful firewalls have limited memory, meaning that we need periodic communication to keep connections alive. If no packets are seen for a while (a common value for UDP is 30 seconds), the firewall forgets about the session, and we have to start over. To avoid this, we must either send packets regularly to reset the firewall's timers, or have some out-of-band way of restarting the connection on demand.
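As a rough illustration of the keepalive idea, here is a minimal sketch of a timer that says when a keepalive packet is due. The class name `KeepaliveTimer`, the `safety_factor` parameter, and the injectable `clock` are assumptions made for this example, and the 30-second default only mirrors the common value mentioned above; real firewalls vary.

```python
import time


class KeepaliveTimer:
    """Tracks when the next keepalive is due for one UDP session (illustrative only)."""

    def __init__(self, firewall_timeout=30.0, safety_factor=0.5, clock=time.monotonic):
        # Send well before the assumed firewall timeout so a lost packet
        # still leaves time for another attempt.
        self.interval = firewall_timeout * safety_factor
        self.clock = clock
        self.last_sent = clock()

    def note_sent(self):
        """Call whenever any packet (data or keepalive) goes out on the session."""
        self.last_sent = self.clock()

    def keepalive_due(self):
        """True once the session has been quiet for at least one interval."""
        return self.clock() - self.last_sent >= self.interval
```

In a send loop you would check `keepalive_due()` periodically, send a small packet when it returns `True`, and then call `note_sent()`. Ordinary data packets also count, since any outbound traffic resets the firewall's timer.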
For UDP, the rule is very simple: the firewall allows an inbound UDP packet if it previously saw a matching outbound packet. In other words, packets must flow out before packets can flow back in.
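The rule can be modelled in a few lines. This is a toy sketch, not how any particular firewall is implemented: the class `StatefulUdpFirewall` and its methods are hypothetical, it ignores session expiry, and it matches on the full (IP, port) pair of both endpoints.

```python
class StatefulUdpFirewall:
    """Toy model: inbound UDP is allowed only after a matching outbound packet."""

    def __init__(self):
        # Each entry is (local_endpoint, remote_endpoint), both (ip, port) tuples.
        self.sessions = set()

    def outbound(self, local, remote):
        """Record the outbound flow; outbound packets are always allowed here."""
        self.sessions.add((local, remote))
        return True

    def inbound(self, remote, local):
        """Allow the packet only if local previously sent to this exact remote."""
        return (local, remote) in self.sessions


fw = StatefulUdpFirewall()
local = ("192.168.1.10", 5000)
peer = ("203.0.113.7", 3478)

blocked_before = fw.inbound(peer, local)   # no outbound packet seen yet
fw.outbound(local, peer)                   # packet flows out first
allowed_after = fw.inbound(peer, local)    # the reply is now let back in
```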
A NAT device is anything that does any kind of Network Address Translation, i.e. altering the source or destination IP address or port. NATs let us have many devices sharing a single IP address, so despite the global shortage of IPv4 addresses, we can scale the internet further with the addresses at hand. Multiple NATs on a single layer allow for higher availability or capacity, but function the same as a single NAT.
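There are 4 types of NATs: "Full Cone", "Restricted Cone", "Port-Restricted Cone" and "Symmetric". They are classified by two properties: whether the firewall is endpoint-independent or endpoint-dependent, and whether the NAT mapping is endpoint-independent or endpoint-dependent. The two restricted variants differ in whether the firewall filters on the remote IP only, or on the remote IP and port.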
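The matrix can be written out as a small lookup. This is a sketch of the classification only; the names `Filtering` and `classify_nat` are made up for this example, and combinations that have no traditional name return `None`.

```python
from enum import Enum


class Filtering(Enum):
    """How strictly the firewall matches inbound packets against outbound ones."""
    ENDPOINT_INDEPENDENT = "endpoint-independent"
    ADDRESS_DEPENDENT = "dependent on remote IP only"
    ADDRESS_AND_PORT_DEPENDENT = "dependent on remote IP and port"


def classify_nat(mapping_is_endpoint_independent, filtering):
    """Return the traditional NAT type name, or None if the combination has none."""
    if mapping_is_endpoint_independent:
        return {
            Filtering.ENDPOINT_INDEPENDENT: "Full Cone",
            Filtering.ADDRESS_DEPENDENT: "Restricted Cone",
            Filtering.ADDRESS_AND_PORT_DEPENDENT: "Port-Restricted Cone",
        }[filtering]
    # Endpoint-dependent mapping: only the strictest firewall has a classic name.
    if filtering is Filtering.ADDRESS_AND_PORT_DEPENDENT:
        return "Symmetric"
    return None
```

For example, `classify_nat(True, Filtering.ADDRESS_DEPENDENT)` gives `"Restricted Cone"`, while a NAT with endpoint-dependent mapping and an IP-and-port firewall is `"Symmetric"`.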
For details, check out https://tailscale.com/blog/how-nat-traversal-works/
When we talk about NAT or WebRTC, we always need to talk about ICE, STUN and TURN: https://www.jimzhao.us/2018/09/ice-stun-turn.html
STUN (Session Traversal Utilities for NAT)
That’s fundamentally all that the STUN protocol is: your machine sends a "what’s my endpoint from your point of view?" request to a STUN server, and the server replies with "here’s the ip:port that I saw your UDP packet coming from."
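To make the request/reply exchange concrete, here is a minimal sketch of a STUN Binding request and of parsing the XOR-MAPPED-ADDRESS attribute from the reply, following the STUN message layout (20-byte header, magic cookie `0x2112A442`). The function names are invented for this example, only IPv4 is handled, and `query_stun` takes the server host as a parameter because no particular server is assumed.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding request with no attributes."""
    transaction_id = transaction_id or os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_xor_mapped_address(response, transaction_id):
    """Return (ip, port) as seen by the server, or None if not found."""
    if len(response) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    if response[8:20] != transaction_id:
        return None  # reply does not belong to our request
    offset, end = 20, 20 + msg_len
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        # value layout: reserved byte, family byte (0x01 = IPv4), port, address
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def query_stun(server_host, server_port=3478, timeout=2.0):
    """Ask a STUN server which ip:port it sees our UDP packet coming from."""
    request, tid = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server_host, server_port))
        response, _ = sock.recvfrom(2048)
    return parse_xor_mapped_address(response, tid)
```

Note that for NAT traversal you would send the request from the same socket your protocol uses, since the mapping the server reports belongs to that socket; `query_stun` opens its own socket only to keep the sketch short.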
TURN (Traversal Using Relays around NAT)
The idea is that you authenticate yourself to a TURN server on the internet, and it tells you "okay, I’ve allocated ip:port, and will relay packets for you." You tell your peer the TURN ip:port, and we’re back to a completely trivial client/server communication scenario.
ICE (Interactive Connectivity Establishment)
The protocol specifies a stunningly elegant algorithm for figuring out the best way to get a connection. For instance, if two peers are on the same WiFi network with no firewalls between them, they can connect directly with no extra effort.
In short, ICE finds the best connectivity path, a STUN server is used to discover your external network address, and TURN servers relay traffic if a direct (peer-to-peer) connection fails. Every TURN server supports STUN: a TURN server is a STUN server with relaying functionality built in. TURN servers support authentication parameters, while STUN servers typically do not.
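The preference for direct paths over relays shows up in how ICE prioritizes candidates. The sketch below uses the candidate priority formula and the recommended type preferences from the ICE specification (host 126, peer-reflexive 110, server-reflexive 100, relay 0). It covers only this ordering step: real ICE also pairs local and remote candidates and runs connectivity checks. The function names and the sample addresses are made up for illustration.

```python
# Recommended type preferences from the ICE specification.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_preference << 8) + (256 - component_id)


def rank_candidates(candidates):
    """Sort (type, address) candidates from most to least preferred."""
    return sorted(candidates, key=lambda c: candidate_priority(c[0]), reverse=True)


candidates = [
    ("relay", "198.51.100.1:49152"),   # allocated on a TURN server
    ("host", "192.168.1.10:5000"),     # local interface address
    ("srflx", "203.0.113.7:54321"),    # public address learned via STUN
]
ranked = rank_candidates(candidates)
```

With these inputs the host candidate ranks first, the STUN-derived address second, and the TURN relay last, which matches the summary above: try direct paths first and fall back to relaying only when they fail.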