Sunday, September 30, 2007

Networking


How To Set Up A Linux Network


Before we get into setting up Linux networking on a Debian system, we'll cover the basics of how to set up a network with both Windows and Linux systems and how to make it a "private" network. Here the term "private" may not mean what you think it does. It has to do with the IP addresses you use on your home or business network. You'll then understand the value of having a proxy/NAT server or a firewall system which also performs the proxy/NAT function on your network.

Once we cover the "whys" and "whats" we'll get into the "hows". You'll see how easy it is to set up a home or small-business network including what hardware is needed. We'll briefly mention what you need to look at on Windows PCs and present more in-depth information on which files are used on a Debian system to set up networking. The Network Configuration Files section shows what files are involved in setting up your Debian system to work on a local network and how they need to be configured to enable the various functions involved in networking including being able to connect to the Internet.

Note: Even if you're not familiar with TCP/IP networks, try giving the material on this page a shot. It's presented in an introductory manner. Don't be concerned if you don't understand all of the material on OSI layers, subnetting, etc. presented on this page. Understanding this material is not necessary when setting up a network or using the subsequent guide pages on this site. It is merely presented for those who wish more in-depth information.

Even if you don't have a network you can still play around with the material present on this page and on the Proxy/NAT and Firewall pages. See the No-Network Network section below on how to do this.
What's particularly appealing about Linux for small businesses and non-profit organizations is that you can set up internal (file, print, database) servers, external Internet (Web, e-mail, ftp) servers, firewalls, and routers (yes, you can set up a Linux system to be a router too), all for very little cost. The operating system and server applications are free and, given that Debian will run on older hardware, the hardware costs can be minimal. These attributes also make it a great toy for those wishing to learn more about networking. Pick up one CD set and you can set up all the Linux servers, firewalls, and routers you want and experiment your brains out.


"Private" Networks

Theoretically, every system on a network needs a unique identifier (a unique address). As such, every system that accesses the Internet would need a unique IP address because TCP/IP is the protocol of the Internet. However, when the Internet exploded in the mid-90s it became clear that there simply were not enough addresses available in the TCP/IP address space for every computer in every office of every Internet-connected organization. That doesn't even take into account those who wanted to access the Internet from home.

The solution was to create "private" address ranges to be used in conjunction with "address translation". Let's look at the first piece first.

Three blocks of IP addresses were set aside as private, meaning that all of the routers on the Internet would be configured to not route them. That's why private addresses are also referred to as "non-routable" addresses. The benefit? If packets from systems with private addresses weren't routed between Internet-connected networks, then a whole bunch of networks could use the same private addresses because they'd never "see" each other's addresses. In other words, these same addresses could be used by any number of computers around the world because if they weren't routed, it would never be "discovered" that they weren't unique.

So if they're not routed, how do you get on the Internet if your computer has a private address assigned to it? That's where the second piece, address translation, comes in. Normally, in order for all the computers in a company to have Internet access they would all have to be assigned routable ("public") IP addresses that could pass through the Internet. Since there aren't enough addresses for this, companies instead assign all of the hundreds of computers in their organization private addresses and they all share a single "public" address to access resources on the Internet. This sharing is accomplished by configuring privately-addressed systems to use a special server, called a proxy server, to access the Internet.

A proxy server has two NICs (Network Interface Cards) because it's connected to two different networks. One NIC is connected to the Internet and is assigned a single "public" (routable) IP address. (This NIC is referred to as the "external interface".) The other NIC is connected to the company's internal network. It is assigned a private IP address so that it can communicate with all of the other privately-addressed computers in the company. (This NIC is referred to as the "internal interface".) The proxy server acts as a "gateway" onto the Internet. (Because of the gateway behavior, a proxy server should also have firewalling capabilities to protect the internal network.) However, in addition to acting as a gateway, it acts as an address translator.

The private IP addresses assigned to the systems on your internal network are chosen by you from one of the three private address ranges listed below.

Public IP addresses are only available from an ISP. In most cases, such as with a dial-up, DSL, or cable modem, your ISP automatically assigns a single public address to your modem using PPP, bootp, or DHCP. This assigned address can change from time to time ("dynamic"). It requires no configuration on your part. Business customers typically obtain multiple public addresses from their ISP. These addresses do not change ("static"). Static addresses are needed for Internet servers that are referenced by DNS records such as Web servers, mail servers, etc. that are contacted using a domain name.
When a computer on the internal network with a private address wants to request information from a Web site, it actually sends the request to the internal interface of the proxy server. The proxy server, with its public routable address on the external NIC, is the one that actually sends the request to the Internet Web server. The Web server sends the response back to the proxy server's external NIC, and the proxy server then forwards the response on to the computer on the internal network that made the initial request. The proxy server keeps track of which internal computers make which requests.

The advantage? Hundreds of computers in a company can access the Internet and only take up a single public Internet address (that of the proxy server's external NIC). Another advantage is security. If your computer's address can't be routed over the Internet, it would be hard for someone to get at your computer from the Internet. (There are ways though.)

The translating of a private address to a public address (outbound request) and back again (inbound response) is most commonly known as NAT (Network Address Translation). In the Linux community it's also often referred to as "masquerading" because the proxy server hides the true identity of the internal computer that made the initial Internet request.
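
The bookkeeping a proxy/NAT server does can be sketched in a few lines of Python. This is only a toy model, not how the Linux kernel actually implements masquerading. The class name NatTable, the public address 203.0.113.5 (an address reserved for documentation), and the starting port 40000 are all assumptions made up for illustration.

```python
# Toy model of NAT/masquerading bookkeeping (illustrative only).
class NatTable:
    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}   # (private_ip, private_port) -> public_port
        self.inbound = {}    # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outbound request to the public address."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Find which internal computer an inbound response belongs to."""
        return self.inbound.get(public_port)


nat = NatTable("203.0.113.5")
print(nat.translate_out("192.168.1.10", 58631))  # ('203.0.113.5', 40000)
print(nat.translate_out("192.168.1.11", 58631))  # ('203.0.113.5', 40001)
print(nat.translate_in(40001))                   # ('192.168.1.11', 58631)
```

Two internal computers can use the same source port because the table hands each conversation its own port on the single public address.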

The internationally-established private IP address ranges that can be assigned to internal network computers are as follows:

  • 10.0.0.1 through 10.255.255.254
    • 16,777,214 addresses
    • 16,777,214 computers on 1 network
      (10.x.x.x)
    • Uses a subnet mask of 255.0.0.0
    • First octet must be the same on all computers
    • A Class A address range

  • 172.16.0.1 through 172.31.255.254
    • 1,048,574 addresses
    • 65,534 computers on each of 16 possible networks
      (172.16.x.x to 172.31.x.x)
    • Uses a subnet mask of 255.255.0.0
    • First two octets must be the same on all computers
    • A Class B address range

  • 192.168.0.1 through 192.168.255.254
    • 65,534 addresses
    • 254 computers on each of 256 possible networks
      (192.168.0.x to 192.168.255.x)
    • Uses a subnet mask of 255.255.255.0
    • First three octets must be the same on all computers
    • A Class C address range
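
If you'd like to check whether a given address falls in one of these three ranges, here is a small sketch using Python's standard ipaddress module. The function name is_private_address and the sample addresses are ours, chosen only for illustration. The blocks are written in prefix notation, which is explained along with subnet masks below.

```python
import ipaddress

# The three private blocks, written in prefix notation.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_private_address(text):
    """Return True if the address falls in one of the three private blocks."""
    addr = ipaddress.ip_address(text)
    return any(addr in block for block in PRIVATE_BLOCKS)

for candidate in ["10.3.2.7", "172.31.255.254", "172.32.0.1", "192.168.1.1"]:
    print(candidate, is_private_address(candidate))
```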

Clearly, with 254 possible addresses on each of the 192.168.x.x private address ranges, using one of these ranges is plenty for most small businesses and those who want to play around with a network at home. (This is why, you may recall, the default IP address in the "Network Setup" part of the Debian installation was 192.168.1.1.) The 10.x.x.x address space is often used by very large organizations with many dispersed locations. They will "subnet" this large private address space so that one location will have an address range of 10.3.x.x, another will have 10.4.x.x, and so on, with each location having the ability to have up to 65,534 computers. Each location may even further subnet their address space for different departments or facilities. For example, in the location that has the 10.3.x.x address space, the engineering department will have the 10.3.2.x space, the accounting will have the 10.3.3.x address space, etc. with each department being able to have up to 254 computers. (You'll see where these numbers come from later.)

Each of the numbers separated by periods in an IP address is referred to as an "octet" because the value of the number (0 to 255) is derived from eight binary bits. An IP address actually consists of two parts. The first part of an IP address identifies the Network that a computer is on, and the other part identifies the individual Computer on that network.

There's an often-used analogy comparing an IP address to a telephone number. The network part of the IP address is analogous to the area code, and the computer part of the IP address is like the individual's phone number. All the people on a phone company network (in one area code) have the same area code number, and no two of them have the same phone number. Conversely, two people can have the same phone number in different parts of the country because they're not in the same area code (not on the same network). It's when you put the area code and an individual's phone number together that the number becomes one that is unique to the entire country (internetwork). In this context, we call the area code the prefix. With IP addresses, the network part of an IP address is the prefix.

Note: If you get into routers you'll learn about another instance where this telephone number analogy holds true. The long distance carriers don't care what the individual's phone number is. All they look at is the area code and route the call to the baby bell's appropriate switching center. Likewise, most network routers don't care about the computer part of an IP address. They only look at the network part of the address because routers route packets between networks, not PCs. The only router that cares about the computer part of the address is the "final hop" router (the router which is connected to the network segment that the destination computer is connected to).
With national phone numbers the prefix is always three digits. Unlike national phone numbers, the prefix of an IP address can vary in length. That's where a subnet mask comes in. It determines how much of the IP address is the prefix (the network part). If you have a subnet mask of 255.255.0.0 it means that the first two numbers (octets) in an IP address are the network part (prefix) of the address and the last two numbers (octets) are used to identify the individual computers on that network.
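
To see the split in action, here is a short sketch that applies a subnet mask to an address with a bitwise AND. The function name split_address is an assumption for illustration. It relies only on Python's standard ipaddress module.

```python
import ipaddress

def split_address(ip_text, mask_text):
    """Split an IPv4 address into its network part and computer part."""
    ip = int(ipaddress.IPv4Address(ip_text))
    mask = int(ipaddress.IPv4Address(mask_text))
    network_part = ipaddress.IPv4Address(ip & mask)
    # Keep only the bits that the mask leaves for the computer part.
    computer_part = ipaddress.IPv4Address(ip & ~mask & 0xFFFFFFFF)
    return str(network_part), str(computer_part)

print(split_address("172.16.5.20", "255.255.0.0"))     # ('172.16.0.0', '0.0.5.20')
print(split_address("192.168.1.37", "255.255.255.0"))  # ('192.168.1.0', '0.0.0.37')
```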

[Figure: What a subnet mask does]

Note that a '1' in a subnet mask indicates a Network bit and a '0' indicates a Computer bit. Because of the way an IP address is split into two parts (network part followed by the computer part), a subnet mask will always be a series of consecutive 1s followed by a series of consecutive 0s. In other words, you'll never see a subnet mask where 1s and 0s are interspersed (ex: 11100110).
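
The "consecutive 1s followed by consecutive 0s" rule is easy to check once a mask is written out in binary. The sketch below does that. The helper names mask_bits and is_valid_mask are ours, not part of any library.

```python
import ipaddress

def mask_bits(mask_text):
    """Return the subnet mask as a 32-character string of 1s and 0s."""
    return format(int(ipaddress.IPv4Address(mask_text)), "032b")

def is_valid_mask(mask_text):
    """A valid mask never has a 0 followed by a 1."""
    return "01" not in mask_bits(mask_text)

print(mask_bits("255.255.0.0"))
print(is_valid_mask("255.255.0.0"))    # True
print(is_valid_mask("255.255.230.0"))  # False (230 is 11100110)
```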

Here's a few points you need to be aware of when you assign IP addresses to systems on a network:

  • All systems on an internal network must have the same subnet mask
  • The network part of the IP address must be the same for all computers on the internal network
  • The computer portion of the IP address must be different for every computer on the internal network
That second point is important. With a Class C network the subnet mask of 255.255.255.0 indicates the first three octets are the "network" part of the IP address. That means that the first three numbers (octets) of the IP address must be the same on all of the computers on this internal network. That leaves only the last octet available to identify individual computers on the network (8 bits gives 256 possible values, and with two of them reserved that leaves 254 computers that can be uniquely identified, a value between 1 and 254 for the last octet).

Note: People are often confused by the 192.168.x.x private address range because it has two 'x's. The first 'x' means you can pick one and only one number from the range of available values (0 through 255) for your network. Because this is a Class C address which uses a subnet mask of 255.255.255.0, the number you choose for the first 'x' must be the same on all systems on your network.

You can set up a network with 192.168.1.x addresses and another network with 192.168.2.x addresses but because these are Class C address ranges they are two totally separate networks. You would need to set up a router to inter-connect the two networks in order for the machines on both networks to talk to each other. If you think the number of servers, workstations, network printers, switches, etc. on your network may grow to more than 254 in the foreseeable future, you may want to consider using the Class B private address space on your network.
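
A quick way to convince yourself that two addresses are (or are not) on the same network is to compare their network parts. This is a minimal sketch. The function name same_network and the sample addresses are assumptions for illustration.

```python
import ipaddress

def same_network(ip_a, ip_b, mask):
    """Return True if both addresses have the same network part under this mask."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{mask}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{mask}").network
    return net_a == net_b

# With a Class C mask these are two separate networks (a router is needed).
print(same_network("192.168.1.10", "192.168.2.10", "255.255.255.0"))  # False
# Two machines that differ only in the last octet share a network.
print(same_network("192.168.1.10", "192.168.1.200", "255.255.255.0"))  # True
```
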
The range of values for the last octet (to assign to computers) on a Class C network is 1 to 254. So you can only have 254 computers on a network that has a subnet mask of 255.255.255.0 (a Class C network). A '0' in the last octet is reserved as the "wire" address for the network and 255 in the last octet is the broadcast address for the network. No system can be assigned either of these numbers. The following addresses can't be assigned to computers when using the private IP address ranges:

"Wire" or "Network" Addresses:
10.0.0.0
172.16-31.0.0
192.168.0-255.0
Broadcast Addresses:
10.255.255.255
172.16-31.255.255
192.168.0-255.255
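
You don't have to work these reserved addresses out by hand. The sketch below asks Python's standard ipaddress module for them. The function name describe_network is ours, and the 192.168.1.0 network is just an example.

```python
import ipaddress

def describe_network(network_text, mask_text):
    """Return the wire address, broadcast address, and usable range of a network."""
    net = ipaddress.ip_network(f"{network_text}/{mask_text}")
    return {
        "wire": str(net.network_address),
        "broadcast": str(net.broadcast_address),
        "first_usable": str(net.network_address + 1),
        "last_usable": str(net.broadcast_address - 1),
    }

print(describe_network("192.168.1.0", "255.255.255.0"))
```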

Because you can't use these two (wire and broadcast) addresses to assign to computers, the number of addresses that you can assign to computers can be calculated using the equation:

2^c - 2 = number of available computer addresses

where c is the number of computer bits in your subnet mask. Now you can see how we came up with the number of computers you could have on each network when we gave the three private address ranges earlier. With a class B address you have 16 computer bits in the subnet mask:

[Figure: What a subnet mask does]

so using our equation we end up with

2^16 - 2 = 65,536 - 2 = 65,534 available computer addresses

On a class C network we have 8 computer bits in the subnet mask so:

2^8 - 2 = 256 - 2 = 254 available computer addresses
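
The same arithmetic can be done for any subnet mask by counting its 0 bits. This is a small sketch. The function name usable_addresses is an assumption for illustration.

```python
import ipaddress

def usable_addresses(mask_text):
    """Count computer bits (the 0s in the mask), then subtract wire and broadcast."""
    mask = int(ipaddress.IPv4Address(mask_text))
    c = 32 - bin(mask).count("1")   # number of computer bits
    return 2 ** c - 2

print(usable_addresses("255.255.0.0"))    # 65534
print(usable_addresses("255.255.255.0"))  # 254
```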

Earlier we said that private address ranges and NAT help conserve IP addresses, but the use of subnet masks can too. For a more thorough explanation of IP address classes and the use of subnet masks, check out the Subnetting section later in this page.

You may run into networks in businesses and schools where the computers don't have private addresses. Our local community college is one such case. That's because they jumped on the Internet bandwagon early when public addresses were being handed out for the asking and got themselves a Class B public address (which allows them to have 65,534 publicly addressed computers). Every PC at all the campuses has a publicly routable address. The upside is they don't need to use a proxy/NAT server in order for the students to be able to access the Internet. The downside is that their internal network is basically a part of the Internet. This is a huge security risk if no firewall is in place because every system is accessible from anywhere in the world (which is why they have a mega-bucks Cisco PIX firewall appliance).

If you're using a networked Windows system at the office and you want to see what your system's TCP/IP settings are, open up a DOS window and enter the command:

ipconfig

To view the TCP/IP settings configured for a NIC on a Linux system type in the following command at the shell prompt:

ifconfig

and look for the "eth0:" entry.

What Network Hardware Does

Now that we've covered the "logical" part of networking, let's take a look at the hardware. When you talk networking hardware you're talking NICs, hubs, switches, routers, cables, etc. However, hubs are quickly disappearing, being replaced by switches. When switches were first introduced, they were significantly more expensive than hubs on a "per port" cost basis, but their prices have dropped to the point where the cost difference isn't an issue.

Why are switches better than hubs? When two PCs want to talk to each other, switches set up a virtual direct connection between the two. This means the bandwidth isn't shared with other systems. The two PCs get it all (100 Mbps). They accomplish this by using the MAC addresses of systems on the network.

Each system on a network actually has two addresses: a "logical" IP address that can be assigned and changed by you at will, and a "physical" hardware address that is permanently burned into a NIC. This permanent hardware address is often referred to as a MAC address or "BIA" - Burned-In Address. (Physical, hardware, MAC, and BIA all refer to the same address.) Each "packet" sent over a network must include both the logical and physical addresses to identify the destination computer.


Technically, using the term "packet" when referring to what's sent across the wire isn't proper. A packet has a very specific meaning in the TCP/IP world. The term "frame" would be more appropriate but it's become rather common to mis-use the term "packet" in this way, such as in "packet sniffers", etc. And actually, nothing is "sent" across a network cable. The voltage on the network cables just fluctuates rapidly and, like the telegraph operators in the old west who knew Morse Code, every device attached to the same LAN cable segment senses and monitors this series of fluctuations to see if they pertain to them (as indicated by the destination MAC address). If you're familiar with the OSI model, these voltage fluctuations are what's referred to by "Layer 1". It's the NIC's job to convert a frame into a series of voltage fluctuations when sending and to use these voltage fluctuations to rebuild a frame when receiving.


The encapsulation scheme was devised because it offers flexibility. One layer's packet is the next lower layer's payload. Packets that travel over a LAN are encapsulated in an ethernet frame. If you're connecting to another network using a modem, your computer encapsulates the packets in a PPP frame. The Layer 2 framing is dependent on the connection technology you're using. When Cisco routers are used to connect remote sites they can use any number of different technologies with their related framing protocols including PPP, frame relay, HDLC, etc. Likewise, the Layer 3 protocol could be IPX on a Novell network rather than IP. In such a case the Layer 3 header would contain source and destination IPX addresses (and Novell's SPX protocol would take the place of TCP).

The sending computer takes your data (your e-mail message or Web site request) and does the series of encapsulations based on the type of network you're on. The receiving computer simply reverses the process and de-encapsulates the received frame to extract the data you sent.
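
The nesting of segment inside packet inside frame can be modeled with three small Python classes. This is only a sketch of the idea: real headers carry many more fields, and the class names, field names, and sample addresses below are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Segment:      # Layer 4 (TCP): ports plus your data
    src_port: int
    dst_port: int
    payload: bytes

@dataclass
class Packet:       # Layer 3 (IP): IP addresses plus a segment
    src_ip: str
    dst_ip: str
    payload: Segment

@dataclass
class Frame:        # Layer 2 (Ethernet): MAC addresses plus a packet
    src_mac: str
    dst_mac: str
    payload: Packet

def encapsulate(data, src_port, dst_port, src_ip, dst_ip, src_mac, dst_mac):
    """Wrap the data one layer at a time, top layer first."""
    segment = Segment(src_port, dst_port, data)
    packet = Packet(src_ip, dst_ip, segment)
    return Frame(src_mac, dst_mac, packet)

def de_encapsulate(frame):
    """Reverse the process: unwrap frame, then packet, then segment."""
    return frame.payload.payload.payload

frame = encapsulate(b"GET / HTTP/1.0", 58631, 80,
                    "192.168.1.10", "172.16.5.20",
                    "00:11:22:33:44:55", "66:77:88:99:aa:bb")
print(de_encapsulate(frame))
```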

One other point to note in the above diagram is that the destination IP address is not the only TCP/IP-related information specified about the destination computer. The TCP port number for the requested service is also specified. HTTP (Web) uses port 80. FTP (file transfer) uses port 21. You may have seen the term "socket" used in TCP/IP literature. A socket is simply referencing the IP address and port number together. For example, traffic destined for a Web server with an IP address of 172.16.5.20 would have a destination socket of 172.16.5.20:80. It's the use of different port numbers (and sockets) that allows you to still surf the Web while you're using FTP to download a file.
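
As a small illustration of the socket idea, the sketch below splits a socket written as "address:port" back into its two pieces and names the service. The function parse_socket and the SERVICES table are hypothetical helpers, and the table only lists the two ports mentioned above.

```python
# Port numbers mentioned above; a real system knows many more.
SERVICES = {80: "HTTP", 21: "FTP"}

def parse_socket(text):
    """Split 'ip:port' into the IP address and the numeric port."""
    ip, port = text.rsplit(":", 1)
    return ip, int(port)

ip, port = parse_socket("172.16.5.20:80")
print(ip, port, SERVICES.get(port, "unknown"))  # 172.16.5.20 80 HTTP
```

Two conversations with the same server stay separate because their sockets differ in the port number, even though the IP address is identical.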

So how does the sending system find out what a destination computer's MAC address is so that it can put it in a frame? It uses ARP. ARP is an address resolution protocol kind of like DNS. Whereas a computer will send out a DNS query packet to a DNS server to resolve a domain name or system name to an IP address, it will also send out an ARP query packet to resolve an IP address to a MAC address. An ARP request is basically just a broadcast that says "Whoever has IP address x.x.x.x, send me your MAC address". We mention this because when comparing hubs to switches to routers you can see where these two different destination addresses come into play. (You'll see how all this ties together in the How It All Works section later in this page.)


Network Device: Hubs
OSI Layer: 1 (Electrical signals and connections)
Packet Handling: Hubs don't look at anything. They serve as a central electrical tie point for all the systems on a network and are basically just repeaters. Any packets received on one port are flooded out all other ports. All systems on the network see all traffic and all share the available bandwidth. As a result, only one computer on the network segment can transmit packets at a time (shared bandwidth).

Network Device: Switches
OSI Layer: 2 (Hardware addressing, MAC address)
Packet Handling: Switches also serve as a central electrical tie point. They don't care about IP addresses. They look at the MAC addresses in packets and build a lookup table that shows which MAC address is plugged into which switch port. This allows them to forward packets MAC-addressed for a specific system to that system only (the virtual direct connection). None of the other systems on the network segment see them. (If a switch doesn't have the destination MAC address in its lookup table it floods the packet, either waiting for the destination system to respond so it can add an entry into its lookup table for it, or it lets the default gateway router deal with it.) As a result, multiple computers on the same network segment can transmit at the same time (bandwidth is not shared).

Another advantage of switches is that the virtual direct connection they set up between two systems allows for full-duplex communications between the two systems. Full-duplex operation has the effect of doubling the bandwidth under certain circumstances because both systems can send and receive packets to each other simultaneously (provided both NICs support full-duplex operation).

There is a disadvantage to switches though. You can't hook a system running a "packet sniffer" to a switch to try and analyze your network's level of bandwidth utilization or capture traffic to/from a specific system because the sniffer system can't see all the traffic. It will only see traffic sent to it and any "broadcast" traffic (such as ARP requests) which is flooded out all switch ports. (Knowing the level of broadcast traffic on your network is helpful however.) You can only sniff traffic to/from other systems when they (and the sniffer system) are connected to hubs. However, Cisco switches have a feature called SPAN that lets you configure a port as a SPAN port to hook a sniffer to. You then specify one or more other ports on the switch which will have their traffic duplicated to the SPAN port. You would specify two ports to monitor the traffic between two systems or all other ports to simulate having your sniffer hooked to a hub. This could slow down a heavily utilized switch though as you would in essence be doubling the amount of traffic across the switch fabric because all unicast traffic is going to two ports instead of one.

Network Device: Routers
OSI Layer: 3 (Logical addressing, IP address)
Packet Handling: Routers do switching too, building a lookup table of the MAC addresses of systems on the network segment that connects to each interface, but they also look at IP addresses. If the destination IP address of a packet isn't on the same network segment as the system that sent the packet, the router looks at a different table (routing table) to see which other interface it needs to send the packet out of to reach the destination network. That's why default gateway devices are routers.

Often times a router will connect to two different types of networks. One interface will connect to an ethernet LAN while another connects to a frame relay WAN link. If the router receives a packet on the LAN interface that needs to be forwarded onto the frame relay link, it will strip off the ethernet frame and re-encapsulate the packet in a frame relay frame before sending it on its way.
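
The way a switch learns which MAC address lives on which port, and floods when it doesn't know yet, can be sketched as a toy simulation. This is not how any particular vendor's switch is built. The class name LearningSwitch and the shortened MAC addresses are assumptions for illustration.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    def __init__(self, port_count):
        self.ports = list(range(1, port_count + 1))
        self.mac_table = {}  # MAC address -> switch port

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the sender's port, then return the ports to forward out of."""
        self.mac_table[src_mac] = in_port
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]            # virtual direct connection
        return [p for p in self.ports if p != in_port]  # flood like a hub

switch = LearningSwitch(4)
print(switch.receive(1, "aa:aa", "bb:bb"))    # [2, 3, 4]  destination unknown, flood
print(switch.receive(2, "bb:bb", "aa:aa"))    # [1]        learned from the first frame
print(switch.receive(1, "aa:aa", "bb:bb"))    # [2]        now forwarded to one port only
print(switch.receive(3, "cc:cc", BROADCAST))  # [1, 2, 4]  broadcasts are always flooded
```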


Switches and routers aren't the only things that build address lookup tables. Computers build ARP tables too, but the table entries are temporary (called the "ARP cache"), usually being removed after a couple of minutes. To see this, open up a DOS window on a Windows system, or go to the shell prompt on a Linux system, and ping another system on your network. When the ping is finished (on Linux systems press Ctrl-C to stop the pinging), type in the following command to view the ARP table entry created as a result of the ping:

arp -a

Given that most broadband Internet connections are only in the 1 Megabit-per-second (Mbps) range, you probably wouldn't notice much of a performance difference between using a hub and a switch on a home network. In this case you could save some money by buying a used 8-port 10 or 100 Mbps hub on places like eBay rather than a new switch. However, on larger business LANs the performance difference could be significant.

Notice one very important point. An ARP request is a broadcast packet. As such, only systems on the same LAN segment will see the broadcast and respond. In other words, ARP will only resolve the MAC addresses of destination systems on the same Layer 2 segment (which is comprised of systems interconnected by hubs or switches).

How does your system know when to try and get the destination system's MAC address? Easy. It looks at the destination IP address. If the destination IP address isn't on the same subnet (doesn't have the same network portion of the IP address), then it knows sending an ARP request with the destination's IP address won't do any good. So instead, it sends out an ARP request with the IP address of the default gateway. Your system knows that to access any system on a different network (or subnet), it has to send the packet out the default gateway router so it ARPs to get the MAC address of that router's interface. (That's why you enter an IP address, subnet mask, and default gateway address in your system's TCP/IP configuration.)
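
That decision boils down to one comparison, sketched below. The function name arp_target and the sample addresses are assumptions for illustration. A real TCP/IP stack consults its full routing table, which may hold more than just a default gateway.

```python
import ipaddress

def arp_target(my_ip, mask, default_gateway, dest_ip):
    """Return the IP address your system would ARP for to reach dest_ip."""
    my_network = ipaddress.ip_interface(f"{my_ip}/{mask}").network
    if ipaddress.ip_address(dest_ip) in my_network:
        return dest_ip           # same subnet: ARP for the destination itself
    return default_gateway       # different network: ARP for the router

print(arp_target("192.168.1.10", "255.255.255.0", "192.168.1.1", "192.168.1.20"))  # 192.168.1.20
print(arp_target("192.168.1.10", "255.255.255.0", "192.168.1.1", "172.16.5.20"))   # 192.168.1.1
```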

Let's take a look at a somewhat realistic frame. You're on your PC at work which is connected to an Ethernet LAN and you send your mom an e-mail. Remember, when sending information the encapsulation process starts at the top OSI layer (Application layer) and works its way down through the transport, network, and link layers. (Received frames go up the OSI layers and get de-encapsulated at each layer.)

  • Since e-mail messages are sent using the SMTP protocol, your e-mail message will get wrapped in a TCP segment with port 25 as the destination port.
  • Then, since your mom's e-mail address contained the domain name of her ISP, your system has to send out a DNS request to get the IP address of the mail server for that domain name. Once it has that it will use it as the destination IP address as it wraps the TCP segment up in an IP packet.
  • Because the source and destination IP addresses are for different networks, your system will send out an ARP request to get the MAC address for the default gateway router's interface. With that it can wrap the IP packet inside an ethernet frame, calculate the checksum value for the FCS (Frame Check Sequence), and pass it down to the hardware layer to be put on the wire.

Note that your system had to do two address resolution queries (DNS and ARP) to get the two destination addresses (IP and MAC). Only after it has both of them can it build the frame.

At some point the packet will likely encounter your proxy server and it will change the source IP address of the packet to the public address of the external interface of the proxy server. The proxy server will use the randomly generated source port number (58631) to keep track of your PC's SMTP "conversation" with the mail server.

As your e-mail packet travels over the Internet the frame type will change to whatever medium (frame relay, ATM, etc.) is encountered along the way (the intermediate hop routers de-encapsulate and re-encapsulate with the proper frame type as necessary because routers can connect to different network mediums). But once the proxy server has changed the private IP address to a public one and sent the frame on its way, the IP addresses in the packet (and any information in the layers above it) never change as it winds its way through the Internet. Only the Layer 2 framing changes.

Note: With point-to-point serial links like a dial-up connection between two modems (PPP Layer 2 protocol) or a leased line between two routers (HDLC Layer 2 protocol) there is no need for Layer 2 addressing. As an analogy, take the old George Carlin line "When there's only two people on an elevator and one of them farts everybody knows who did it". If there's only two devices connected to a link and one of them is sending they both know who needs to be receiving. The packet is still encapsulated in a frame appropriate for the medium, but there is no addressing information in the frame header.
Going back to the OSI model for those studying networking, you can see how the pieces of the above frame relate to the lower layers of the model. It also shows how all the different technologies relate to each other.

Note that a NIC is both a Layer 1 device (because it has a 10/100-Base-T connector) and a Layer 2 device (because it looks at the MAC addresses of the frames it sees). The same goes for switches (because they have 10/100-Base-T connectors and they look at the MAC address to build their MAC address-to-port lookup table). A router can have multiple interfaces of varying types (serial, 10/100-Base-T, Token Ring) so it is a Layer 1 device (different connector types). It has to appropriately re-frame a packet for each of those network types and apply the appropriate Layer 2 address (so it is also a Layer 2 device), and it looks at the IP address to see which interface the packet needs to go out of (so it's also a Layer 3 device).

Also note that the Layer 3 Network technologies (IP, IPX, etc.) are totally independent of the Layer 2 Link technologies which is why you can have a TCP/IP network span different network types or run Novell IPX/SPX on an Ethernet or Token Ring network.

There's one important point about Layer 4. Note that it's the server applications that open ports. A common security practice is to run a "port scanner" program on a workstation and point it at a server to see what ports are open on that server (to see which ports the server is listening on). If ports are open that shouldn't be, it's because some application or memory-resident service opened the port and is listening for service requests on it. To close ports you need to find out which application or service has them open and shut them down as well as keep them from starting when the server is booted. As you will see in the Internet Servers page, applications are often started at bootup through inetd.
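
What a port scanner does for a single port can be sketched with Python's standard socket module. The function name port_is_open is ours. To keep the example self-contained it only tests a listener it opens itself on the loopback address (127.0.0.1), so nothing outside your own machine is scanned. Only point a real scanner at servers you are responsible for.

```python
import socket

def port_is_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

# Stand-in for a server application that opens a port and listens on it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0 asks the OS for any free port
listener.listen(1)
port = listener.getsockname()[1]

print(port_is_open("127.0.0.1", port))  # True while the application is listening
listener.close()                        # shut the application down
print(port_is_open("127.0.0.1", port))  # False once nothing is listening
```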

The above was a simplified coverage of frames and packets. There's more than just addresses in frame (Layer 2), packet (Layer 3), and segment (Layer 4) headers. There are flags, status indicators, sequence numbers, etc. And we didn't show the information in the layers between the Transport layer (TCP ports) and the Application layer (the data). For example, NetBIOS and client/server technologies like SQL would use the "Session" layer above the Transport layer.

Time spent learning the OSI model is an investment that will likely pay off in big time savings when troubleshooting network problems. The layers allow you to narrow down and isolate the possible sources of problems. As a simple example, take a Web server that isn't responding to requests from browsers. If you can ping the server, which uses Layer 3 ICMP packets, you know the network configuration is OK but for some reason the Apache Web server software isn't listening on port 80 (Layer 4). Likewise if two Windows systems running NetBIOS over TCP/IP can't "see" each other in Network Neighborhood but they can ping each other you know the TCP/IP properties are OK and it's likely a Session layer problem. Session layer problems could be something with the Windows Networking configuration (ex: File and Print Sharing isn't enabled or the Server service isn't running - neither of which is needed for TCP/IP at layers 3 and 4 to operate properly).

The Two Parts of Ping

Be careful about your assumptions when using the ping command. It has two parts: the ECHO (what you send) and the ECHO REPLY (what the remote system sends back).

It is easy to assume that if you don't get a response to a ping there is no connectivity between the two systems. This is not necessarily true. The ping ECHOs may very well be reaching the remote system. However, if the remote system isn't configured correctly, you may not receive the ECHO REPLYs.

The most common reason for this is that the remote system doesn't have a route in its routing table to the network your system (the ECHO sending system) is on. As a result, the remote system is trying to reply but either it doesn't know where to send the replies, or it is sending them but out of the wrong interface (which can happen if the replying system has multiple NICs, but remember that your system sees a connected modem as the ppp0 interface, essentially another NIC).

Causes for this are incorrect values for the default gateway, subnet mask, or IP address on the remote system (i.e. an incorrect Layer 3 configuration). Also consider the fact that the incorrect values could be on the ECHO sending system. In this case, while there may be connectivity between the two systems, the ECHOs are never being sent or are being sent out of the wrong interface. Although Layer 3 may be configured incorrectly, there may still be traffic coming from the system. Look at the port indicators on your hub or switch (for ethernet interfaces) when you ping to check for the existence of Layer 1 signals indicating ping traffic. Likewise, the indicators on an external modem can be used to see if ping attempts are being sent or received at the lower layers with dial-up connections.

If you're trying to ping a system on a remote network it's also possible that a router or firewall somewhere between the two networks is blocking one or both of the ping components or the router/firewall itself has an incomplete routing table.

Don't get hung up on the word "protocol" that you see mentioned a lot in networking. It's just a technical term for "a set of rules or commands". The PPP protocol dictates that if you're going to put a PPP frame on the wire the frame has to be comprised of these fields in this order with these lengths and each field has to contain this information. The SMTP protocol dictates that if you're going to use the SMTP service on a mail server to send a mail message you have to use these commands with these parameters in this order to set up the connection, specify the sender, specify the recipient, etc. Defining a protocol is just a means of establishing a standard. If everyone builds their hardware or writes their software to follow the same rules then all the pieces work with each other.
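To make the "commands in this order" idea concrete, here is a rough Python sketch that lists what a client sends to an SMTP server for one message. The function name and the host and mailbox names are made up for illustration, and the sketch leaves out the server's numeric replies and details such as escaping lines that begin with a dot.

```python
def smtp_dialogue(client_host, sender, recipient, body_lines):
    """Return the commands a client sends, in protocol order, for one message."""
    commands = [
        f"HELO {client_host}",       # set up the connection: identify ourselves
        f"MAIL FROM:<{sender}>",     # specify the sender
        f"RCPT TO:<{recipient}>",    # specify the recipient
        "DATA",                      # announce that the message text follows
    ]
    commands.extend(body_lines)      # the message itself
    commands.append(".")             # a line with a lone dot ends the message
    commands.append("QUIT")          # close the connection
    return commands

# Hypothetical names, used only for illustration.
for line in smtp_dialogue("junior", "junior@example.com",
                          "grandpa@example.com", ["Subject: Hi", "", "Hello!"]):
    print(line)
```

A mail server that follows the same rules will accept these commands in this order and reject them if, say, RCPT TO arrives before MAIL FROM. That ordering is all "protocol" really means here.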

Debian comes with two great "sniffer" packages that allow you to look at individual packets. tcpdump is a console program that will display packet traffic in a real-time manner, with the packets scrolling up the screen as they are received and displayed. Ethereal is a sophisticated GUI protocol analyzer that breaks the frames down into their individual OSI layers and displays the data contained in each of the frame, packet, and segment fields. I use a dual-boot notebook at work with Fluke's expensive (read 'thousands of dollars') "Protocol Expert" sniffer installed on the Windows 2000 partition and Ethereal installed on the Debian partition, and the differences in the information provided by the two are very small indeed.


Where to learn more - The best of our bookshelves:

First Year Companion Guide

If you're interested in really learning how networks work and the TCP/IP protocol, Cisco's First Year Companion Guide is THE book for you. It's one of my all-time favorite networking books. It really makes the light bulbs come on. It's actually the first-year text book for the Cisco Networking Academy program but the first half of the book (the first semester of the program, which is Chapters 1 through 15 in the book) deals entirely with the "basics" of networking. It has one of the most thorough presentations of the OSI model I have ever seen in any book, and understanding the functions of the various layers in the OSI model is understanding how networks work. TCP/IP, address classes, encapsulation, the functions of switches and routers, and how to subnet a network are all covered. They even get into security, network management, and home networking later in the book (along with some basics on router programming so you can see how that's accomplished). A strong foundation in the OSI model is essential if you're going to be good at network troubleshooting and this book will get you there. The book gets some bad reviews on Amazon due to some typos and misplaced diagrams. But the fact that you can easily identify a typo or misplaced diagram indicates you understand the material. (Be sure to get the newer 2nd Edition which fixes many of these mistakes.)


And as long as we're on the subject of cool Debian packages and verifying connectivity, one package that's probably already on your system as part of the default installation is the mtr utility. It's like the traceroute command, only better. If your Debian system has Internet connectivity and you have the correct DNS server settings in the resolv.conf file, type in the following at the shell prompt:

mtr www.debian.org

It's a great tool for monitoring, in real time, the effects of any changes you make to server, proxy, firewall, router, etc. configurations. Press Ctrl-C to quit the program.

Sometimes you'll make a change and then you don't get the results you expected. It may be because systems cache certain "reachability" information. We said earlier that arp maps IP addresses to MAC addresses (layer 2). Systems also maintain routing tables which relate an interface to the path needed to get to a given IP address (layer 3). Often times you will need to clear these caches to get the results you expect.

Function              Linux Command      Windows (DOS) Command
View ARP cache        arp                arp -a
Clear ARP cache       arp -d entry       arp -d
View routing table    route              route print
Clear routing table   route del entry    route -f
View DNS cache        n/a                ipconfig /displaydns
Clear DNS cache       n/a                ipconfig /flushdns

Note that there are no Linux versions of the DNS commands. Windows caches a lot more things to try and improve its performance so if you're using a Windows system to test DNS changes be sure to clear that cache also. Also note that you can't flush all entries from the arp cache or routing table with Linux. It has to be done one entry at a time.

What A Subnet Mask Really Does

So now that you know about ARP and MAC addresses let's go back to subnet masks for a minute. We know that a subnet mask identifies the network portion of an IP address, and that you have to enter a subnet mask on every system on an internal network and it has to be the same on all those systems. But how does a system make use of it? It has something to do with the default gateway address you also enter on most systems.

When a system wants to send a packet to another system, it takes the IP address of the destination system and compares it to the subnet mask. If the network portion of the address is the same it knows the destination system is on the same network (segment) it is. So it sends out an ARP request which in effect says "System with IP address x.x.x.x send me your MAC address." When it gets the reply it puts the destination system's MAC address in the frame header and puts it on the wire.

If the system compares the destination IP address to the subnet mask and finds that the destination system is on a different network it still sends out an ARP request, but this time the ARP request contains the IP address you entered for a default gateway (typically a router). When the router responds to the sending system, the sending system puts the router's MAC address in the frame header (Layer 2), sends the packet to the router, and lets the router worry about the Layer 3 stuff. As a result, you can see that the only Layer 3 function performed by end systems is to find out if a destination system is on the same network (subnet) or not using the subnet mask (and getting the end system's IP address in the first place). If it's not, it sends it to the default gateway and lets it worry about getting the packet to the different network. (But how does the sending system get the destination system's IP address in the first place? That's why you also have to enter DNS server addresses on every system. You'll see how they use these addresses on the DNS Services page.)
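The decision described in the last two paragraphs can be sketched with Python's standard ipaddress module. This is a simplified model, not how any particular operating system implements it. The function name arp_target and the gateway address 172.18.0.1 are assumptions made for the example.

```python
import ipaddress

def arp_target(source_ip, subnet_mask, default_gateway, destination_ip):
    """Return the IP address the sending system will ARP for."""
    # strict=False lets us pass a host address plus mask and get its network
    local_net = ipaddress.ip_network(f"{source_ip}/{subnet_mask}", strict=False)
    if ipaddress.ip_address(destination_ip) in local_net:
        return destination_ip     # same network: ARP for the destination itself
    return default_gateway        # different network: ARP for the gateway

# 172.18.0.1 is a hypothetical gateway address used only for illustration.
print(arp_target("172.18.5.10", "255.255.0.0", "172.18.0.1", "172.18.3.33"))
print(arp_target("172.18.5.10", "255.255.0.0", "172.18.0.1", "172.16.2.12"))
```

In both cases the IP packet is still addressed to the final destination. Only the MAC address placed in the frame header changes.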

Given all of the above you can figure out what problems you'll have if you have an incorrect subnet mask on a system. Let's use the Class B network 172.18.0.0 as an example. Recall that the correct subnet mask for a Class B network is 255.255.0.0 so all systems on the same network segment have addresses that start with '172.18'.

If a system on this network has an IP address of 172.18.5.10 and an incorrect subnet mask of 255.0.0.0 it'll think that any system where the first octet is 172 is on the same segment it is. So while the system 172.16.2.12 is on a different segment, the system with the incorrect mask will think it's on the same segment and will never try to use the default gateway for packets destined for it. It'll just sit there and ARP its brains out trying to get a response from 172.16.2.12 with its MAC address. Eventually it'll error out and say that the system is unreachable.

Going the other way, if the Class B system (172.18.5.10) has a Class C mask of 255.255.255.0, it'll think that only systems with addresses that start with '172.18.5' are on its local segment. So if it has a packet destined for 172.18.3.33 (which is on the same segment) it'll think it's on a different segment and send an ARP request with the IP address of the default gateway router and send the packet to it, even though it should have just sent an ARP request using the IP address of the destination system itself. In this instance, the router may be smart enough to just send the packet back onto the same segment to the destination system or it may just drop it. It depends on how the router is configured. And if the router is smart enough you may not notice there's a problem. But having a "narrower" subnet mask than you should would place an undue load on your gateway router and unnecessary traffic on your network.
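The comparison a system makes is a bitwise AND of each address with the mask. The following Python sketch works through the two misconfigurations above by hand. The helper names to_int and same_segment are made up for this example.

```python
def to_int(dotted):
    """Convert a dotted-quad string such as '172.18.5.10' to a 32-bit integer."""
    a, b, c, d = (int(octet) for octet in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def same_segment(local_ip, mask, other_ip):
    """True if, under `mask`, `other_ip` looks local to `local_ip`."""
    m = to_int(mask)
    # AND-ing with the mask keeps only the network portion of each address
    return (to_int(local_ip) & m) == (to_int(other_ip) & m)

cases = [
    ("255.255.0.0",   "172.16.2.12"),   # correct mask, system on another segment
    ("255.0.0.0",     "172.16.2.12"),   # incorrect mask from the first example
    ("255.255.0.0",   "172.18.3.33"),   # correct mask, system on the same segment
    ("255.255.255.0", "172.18.3.33"),   # incorrect mask from the second example
]
for mask, dest in cases:
    local = same_segment("172.18.5.10", mask, dest)
    verdict = "local, ARP directly" if local else "remote, use gateway"
    print(f"mask {mask:<15} dest {dest:<12} -> {verdict}")
```

With the correct mask the first destination is treated as remote and the second as local. Each incorrect mask flips the verdict, which is exactly the behavior described in the two paragraphs above.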


At The Office

The wide range of addresses available with the 172.x.x.x space allows you to simplify things in larger networks. You pick one and only one value for the first 'x' (the second octet) from the available range of 16 through 31, but you can use different values for the third octet to try and keep track of things.

For example, let's say you decide to use 172.18.x.x for your network. You could manually assign all of your servers addresses in the 172.18.1.x range. All of your printers could be manually assigned addresses in the 172.18.2.x range, and you could set up a DHCP server on your internal network to automatically assign addresses in the 172.18.3.x and 172.18.4.x ranges to your workstation computers. (In DHCP lingo, the address spaces you set up for the DHCP server to hand out individual addresses from are called "scopes" and the addresses are "leased" to workstations.)

Even though the systems in your network would have a value for the third octet ranging from 1 to 4 in this example, because you'd only be using a Class B subnet mask (255.255.0.0), they would all still be on the same network. You're still using two octets to identify individual computers instead of just one octet as with a Class C address. It's just that you're using the third octet as a means of logically grouping or identifying devices.
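Here is a small Python sketch of that addressing plan, assuming the 172.18.x.x example above. The GROUPS table and the describe function are illustrative names only. The point is that the third octet is just a label. Network membership is decided by the 255.255.0.0 mask alone.

```python
import ipaddress

NETWORK = ipaddress.ip_network("172.18.0.0/255.255.0.0")

# Third-octet convention from the example; purely an administrative label.
GROUPS = {
    1: "server",
    2: "printer",
    3: "workstation (DHCP)",
    4: "workstation (DHCP)",
}

def describe(address):
    """Return the logical group for an address, based on its third octet."""
    ip = ipaddress.ip_address(address)
    if ip not in NETWORK:
        return "not on this network"
    third_octet = ip.packed[2]    # packed is 4 bytes; index 2 is the third octet
    return GROUPS.get(third_octet, "unassigned group")

for addr in ("172.18.1.20", "172.18.2.5", "172.18.4.77", "172.19.1.1"):
    print(addr, "->", describe(addr))
```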

Note: Using different numbers in the "computer" portion of the address as in the previous example is not the same as the "subnetting" we referred to when discussing the 10.x.x.x network address space earlier. While a subnetted Class B network would likely have different values for the third octet on each subnet, simply assigning a certain group of computers the same value for the third octet in a Class B address range does not create a subnet. Subnetting is done with routers to reduce the size of broadcast domains which has the effect of increasing available LAN bandwidth. (Subnetting is covered in more detail in the Subnetting section on this page.)

If you want to allow your internal private network to access the Internet you can set up a broadband connection to the Internet. This isn't much different than getting cable or DSL access at home. You would then set up a proxy/NAT server to allow everyone on the internal network (with their non-routable IP addresses) to share this broadband Internet connection. Most ISPs require businesses to get a "business account". However, most business accounts have special features that make them preferable.

For one, check to see if the business-account broadband connection is symmetrical (i.e. the speed of the connection is the same in both directions). Home accounts have an asymmetrical connection where the download speed is much faster than the upload speed. The other important feature is that most business accounts include several static public IP addresses that you are assigned. These two features are necessary if you want to set up your own Web and/or e-mail servers (as you will see on the Internet Servers page). The cable or DSL interfaces for most home accounts are dynamically assigned a temporary public IP address. (A comparison of DSL and cable services is given later on this page.)

Keep in mind though, that security becomes a major issue when you start connecting business networks to the Internet. Ideally you'd want to put any Web and/or e-mail servers in a "DMZ" with one firewall between them and the Internet. You'd then put a second firewall between the DMZ and your internal network to further restrict traffic. A DMZ is also referred to as a "screened subnet". It's also a good idea to put an IDS (Intrusion Detection System) in your DMZ.

A DMZ configuration

We get more into how to set up a DMZ using Linux systems for the outside and inside firewalls on the Firewall page. We also show you how to set up and test a Snort IDS system on the Security page.



At Home

Setting up a network at home is easy. Hardware-wise you just need to install some 100 mega-bit PCI NICs (Network Interface Cards) in the PCs, buy an 8-port 100 mega-bit switch, and the RJ-45 Cat 5 UTP cables to connect the PC NICs to the switch. (Get an 8-port instead of a 4-port switch. They don't cost that much more and it's always better to have too many network connections than not enough.) For home networks a hub would work just as well as a switch, but not too many companies make hubs anymore because switches have dropped in price so much. Some computers come with NICs integrated into the motherboard. Check the back of your computer for an RJ-45 jack. It looks like a telephone jack but has 8 contacts instead of the 4 contacts with telephone jacks.

If some of your systems are scattered or you don't want to run cables to them you could always add a wireless NAP (Network Access Point - which is like a wireless hub that would plug into a port on your switch) to your network but this would also require getting wireless NICs for those PCs. They also make PCMCIA versions of NICs so you can connect your notebook into your home network. Once you've got the hardware in place, you just configure the network settings in the PC software. We'll cover the Debian network configuration below. On Windows PCs, go to
Start/Settings/Control Panels/Networking
and get into the TCP/IP properties for the NIC where you can enter an IP address for the PC, a subnet mask, and possibly a default gateway (if you have a broadband Internet connection). When you enter an IP address into the TCP/IP properties like this it's called a "static" IP address because it won't change unless you change it.

The family in the diagram below chose a Class C private address range of 192.168.5.x.

Setting up a home network

When you set the TCP/IP properties you are assigning the private addresses to the NICs inside the computers. You can have a NIC with a private address assigned to it on your home network and still use a dial-up modem connected to the same system to get on the Internet. Your modem gets a different (public routable) IP address from your ISP every time you dial in and retains it as long as you're connected.

The names above are not just identifying family members. They are the names given to the systems (hostnames). Since a home network is too small for a DNS server, you can set up name resolution using a simple hosts file. A hosts file is just a simple text file you can edit using any text editor (like NotePad on Windows systems). Note that this file does not have an extension. On most Linux and UNIX systems the path to open the file is:

/etc/hosts

On Windows systems the paths are typically:

95 / 98:
C:\Windows\hosts
NT / 2000:
C:\WINNT\system32\drivers\etc\hosts

Once you open the file in a text editor, add the address/hostname pairs for the systems on your network. For the above home network the hosts file would look like this:

127.0.0.1	localhost
192.168.5.10	dad
192.168.5.11	mom
192.168.5.12	grandpa
192.168.5.13	junior
192.168.5.14	sis
192.168.5.15	fido

(That's a Tab character between the address and hostname.) Each system that would want to do name resolution would need to have this hosts file but once you create one you can just copy it to other systems. Then instead of entering the command:

ping 192.168.5.12

you could just enter:

ping grandpa

If Grandpa had a Debian system running the Apache Web server software, Junior could open up a Web browser on his system and enter the following URL to see Grandpa's Web pages:

http://grandpa/
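What a system does with a hosts file is essentially a table lookup. The Python sketch below parses hosts-style text and resolves a name to an address. It is a simplified illustration, not the resolver your operating system actually uses, and the function name parse_hosts is made up for this example.

```python
HOSTS_TEXT = """\
127.0.0.1\tlocalhost
192.168.5.10\tdad
192.168.5.11\tmom
192.168.5.12\tgrandpa
192.168.5.13\tjunior
192.168.5.14\tsis
192.168.5.15\tfido
"""

def parse_hosts(text):
    """Build a hostname -> address table from hosts-file text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()          # tabs or spaces both work here
        for name in names:
            # keep the first address seen for a name
            table.setdefault(name.lower(), address)
    return table

hosts = parse_hosts(HOSTS_TEXT)
print(hosts["grandpa"])
```

So "ping grandpa" amounts to looking up grandpa in this table and then pinging the address that comes back.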

The downside of hosts files is that you have to have one on every system on your network and if a change needs to be made it needs to be made to all of the files manually by editing them with a text editor. As an alternative to setting up a hosts file on all the computers you could set up a DNS server for your LAN. This DNS server would also have the ability to resolve Internet host/domain names if you have a broadband connection to the Internet. Actually, you don't need a separate server. You can just run the DNS service on your existing Debian system. We show you how it all works, and how easy it is to set up, on the DNS page.

If you have cable or DSL service at home, you can share this broadband connection with all of the computers on your home network. Cable or DSL providers typically give you the option of having an external cable or DSL modem, which then connects to your computer's NIC via a Cat 5 LAN type cable or connects to your computer's USB port, or an internal model that gets installed in one of your computer's slots. Stay away from anything USB and get the external model that plugs into a NIC. That way, instead of plugging the cable into your computer's NIC you can plug it into a device known as a cable/DSL router. This cable/DSL router would then plug into the hub or switch that you use to connect all of your computers together. (Some cable/DSL routers have a built in 4-port hub.) A cable/DSL router is basically just a "proxy-server-in-a-box".

The "modem" you get from your cable or DSL provider is actually just a device called a "bridge". It bridges traffic from one type of network (cable or telephone) to another (ethernet-based LAN) without doing any packet inspection, filtering, or manipulation. It is essentially a media converter.

Note that a cable/DSL router is not the same as the cable or DSL "modem" or "interface" that you get from your cable company or ISP. The router allows you to take the incoming Internet connection, which normally goes to a single computer, and plug it into a hub or switch so that the Internet connection can be shared by your entire internal network. Some can even be set up to act as DHCP servers to hand out private IP addresses to your home PCs each time you turn them on. If you don't want to assign static addresses to the PCs on your home network you just set them up to use a DHCP server and then enable the DHCP function on the cable/DSL router.

Your cable or DSL service provider assigns you a single public (routable) IP address. If you don't have a cable/DSL router, and because of the bridging action of the modem they give you, it is your PC which receives this public, routable IP address. As a result, only one PC can access the Internet. A cable/DSL router performs the NAT function explained earlier which allows this single public address (which is assigned to the router rather than a single PC) to be shared by multiple systems on your LAN.
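One common way a router shares a single public address is port-based NAT (sometimes called masquerading): each outgoing connection is given its own port on the public address, and replies are mapped back using that port. The Python sketch below is a toy model of the bookkeeping only. The class name, the starting port 40000, and the public address 203.0.113.7 (from a range reserved for documentation) are all assumptions made for this example.

```python
class NatTable:
    """Toy model of the translation table kept by a NAT router."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}   # (private_ip, private_port) -> public_port
        self.inbound = {}    # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet to the public address."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Find which internal system a reply belongs to (None if unknown)."""
        return self.inbound.get(public_port)

nat = NatTable("203.0.113.7")
print(nat.translate_out("192.168.5.12", 1025))   # grandpa's browser
print(nat.translate_out("192.168.5.13", 1025))   # junior's browser, same local port
print(nat.translate_in(40001))                   # reply goes back to junior
```

Because unsolicited inbound traffic has no entry in the table, it has nowhere to go, which is part of why a NAT device offers some protection to the systems behind it.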

Note in the above diagram that if you use a commercial cable/DSL router you may need one of two things (it depends on the model you buy). The cable that connects the router to your hub or switch may have to be a Cat 5 "crossover" cable, or you may have to use a regular Cat 5 LAN cable and plug it into the "uplink" port found on most hubs/switches. If you try using a regular cable but the "link" indicator on your hub/switch doesn't light up for that connection, you'll likely have to do either of the above. (If your hub or switch doesn't have an uplink port, you'll have no choice but to get a crossover cable.) None of this applies if you are using a Linux system as a proxy server.

IMPORTANT: Because Windows enables file and print sharing by default, having a publicly routable IP address assigned to your Windows PC is a security risk. The file and print sharing functions open ports on your system and the public address makes your system accessible from anywhere on the Internet. So even if you don't need to share your broadband Internet connection with other PCs (i.e. the cable or DSL modem now plugs directly into your PC), you should either disable file and print sharing or have some kind of NAT or firewall system (cable/DSL router or Linux system) between your PC and the modem to protect your PC. (You could disable these sharing functions but this would negate the point of having a home network in the first place - to share files, printers, etc.)

In the above diagram, the public IP address dynamically assigned by your ISP is on the router's "external interface" (connection), and you assign a static private IP address to the "internal interface". Note that the "default gateway" setting on all of your home network PCs is this same static private address you assigned to the internal interface of the router.

The default gateway setting on a PC is kind of a "last resort" setting. (Cisco routers actually refer to the default gateway setting as the "gateway of last resort".) It basically tells the computer that if it wants to contact another computer, and it can't find that other computer on the local network (which Internet servers won't be), it should send the traffic for this other computer to the address of the default gateway. The default gateway device is typically a router, so it will know what to do with any traffic that's destined for a distant network (which is why they call it a "gateway"). So in addition to doing address translation, a commercial "proxy-server-in-a-box" or Linux proxy server is also a router which acts as a gateway.
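The "last resort" behavior falls out naturally if you picture the PC's routing table as a list of networks where the most specific match wins and the default route (0.0.0.0/0) matches everything. The Python sketch below models that for the home network in this section. The router address 192.168.5.1 is a hypothetical value chosen for illustration.

```python
import ipaddress

# A two-entry routing table for a PC on the 192.168.5.x home network.
ROUTES = [
    (ipaddress.ip_network("192.168.5.0/24"), "direct (local network)"),
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.5.1 (default gateway)"),
]

def next_hop(destination):
    """Pick the route with the longest matching prefix for `destination`."""
    ip = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if ip in net]
    # 0.0.0.0/0 always matches, so `matches` is never empty
    net, hop = max(matches, key=lambda item: item[0].prefixlen)
    return hop

print(next_hop("192.168.5.12"))    # another PC on the home network
print(next_hop("198.51.100.20"))   # some Internet server
```

The default route only wins when nothing more specific matches, which is why it is described as the last resort.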

Cable/DSL routers are available from manufacturers such as Linksys and DLink for under $100. Here's an example list using mostly Linksys equipment of what you'd need to set up a home network where you can share a broadband Internet connection. (You may want to spend a little more and get 3Com 3C905 NICs because they have wide driver support.):

  • An 8-port 10/100-mb Switch * - approx $40
  • A Cable/DSL Router for Internet access sharing * - approx $50
  • One 100-mb NIC for each computer - (get used 3Com 3C905C NICs on eBay)
  • One PCMCIA NIC for each notebook - approx $40 ea
  • One Cat 5 LAN cable for each computer/notebook (variable lengths) - approx $15 ea

    * You could replace the switch and router with an
    integrated model but that limits your options for
    cable runs, experimenting, and troubleshooting.

For adding wireless nodes:


So to network (wired) four desktop PCs at home without a broadband Internet connection you're looking at approximately $175 (switch, NICs, and cables). Around $250 (additional cable and router) if you do have a broadband connection to share. This can always be added on as a separate piece later if you don't have a broadband connection now.

Alternative: Instead of buying a commercial cable/DSL router product you could just use a Linux box to act like one. Since a cable/DSL router is nothing but a proxy-server-in-a-box, you could set up a Linux system to be a proxy server and it's easy to do. (You'll see how easy on the Proxy/NAT page.) The advantage of using a Linux system as your cable/DSL router is that you can customize the firewalling capabilities of the router using something called IPTABLES. This not only includes customizing what traffic you allow into your network, but also allows you to restrict which systems on your network can access certain services or sites on the Internet. (You'll see how to do this on the Firewall page.)

How To Share An Internet Connection with a Linux Firewall / Proxy Server

Another advantage of using a Linux server over a commercial cable/DSL router is that you could also use it as your own home Web and e-mail server using the dyndns.org dynamic DNS service. We cover this on the DNS Services page. And since you've got a Web server running, you can easily add a Web cam to it so you can keep an eye on your home from anywhere that has a Web browser. We show you how to add that on the Web Cam Server page.

The concept of sharing an Internet connection is the same no matter what type of Internet connection you have. With a Linux system acting as a proxy server or firewall, you can just as easily share a modem connection.


Sharing A Modem Connection

This is essentially the same type of configuration you have if you use the "Internet Connection Sharing" option found on Windows systems. See the Modems page for information on getting a modem to work on your Debian system. Then see the Proxy/NAT page for a script that you can use to get your Debian system to act as a proxy server.

One important point to keep in mind when setting up a gateway type of system (which includes proxy servers and firewall systems) is that the "internal" interface should NOT have a Default Gateway entry. Leave it blank. The Default Gateway entry for the "external" interface will either be automatically assigned when you connect to your ISP or should otherwise be available from them.

So the proxy server that sits between your private network and the Internet actually has (or should have) three functions: a gateway (router), an address translator, and a firewall. The firewalling capabilities of the commercial cable/DSL routers are limited at best and typically not upgradable if a new sort of attack is developed. We'll show you how to set your Debian system up to protect your network on the Firewall page.

Note: When we use IP addresses in our examples on these guide pages, they will be in the private IP address ranges even if it's an example where a public address would normally be needed (such as with an Internet-connected interface). We use addresses like this so we're not using IP addresses that have been legally licensed to companies or ISPs.

If you also want your Debian system to act as a DHCP server handing out addresses to the PCs on your home network when they boot up, you can easily accomplish this. However, we won't cover it here. Based on the information we've given so far on searching for and installing packages, this would be a good one for you to try on your own. If you do set it up, be sure to set the PCs on your network to use a DHCP server. In Windows, this means setting the TCP/IP properties for the LAN interface to "Obtain an IP address automatically".

If you are going to connect your ISP's cable or DSL modem to the "external" interface (NIC) on your Debian system, you'll likely need to run some sort of DHCP client software on the system so that it can pull the ISP-assigned address and apply it to the external NIC. This is because it's not the cable or DSL modem that gets assigned an IP address. A cable or DSL modem merely acts as a bridge. Whatever device is connected to the customer side of the modem (PC, router, etc.) is what actually gets assigned an IP address. We likewise won't cover installing a bootp or DHCP client here. See the Web HOWTO documents for cable or DSL modems. (The reason that no DHCP client software is necessary if you're using a dial-up modem connection is because the dynamic addressing is handled by the PPP protocol.)


A No-Network Network

On the Proxy/NAT and Firewall pages we'll show you how to configure your Debian system so you can share your broadband connection with all of the computers on a network, including a home network. However, you don't need a formal network or even a broadband connection if you want to try setting up the proxy and firewall functions demonstrated on these pages.

For those that don't have a home network or access to a business network you can still try the examples we present on these pages by setting up a two-system network (your Debian system and a Windows PC for example). All it requires you to do is to have a NIC installed and configured in each computer, and to purchase a special cable called a "Cat 5 cross-over" cable. (As mentioned above, your computers may have network cards integrated into the motherboard already. Check the back of the systems for an RJ-45 jack.) If one of the computers you use is a notebook, and it doesn't have an integrated NIC, you'll need to get a 100 mb PCMCIA card. You can get those used for about $20 on eBay or similar sites.

A cross-over cable flips the transmit and receive pairs of the cable so they are on different pins on the connectors on each end. This allows you to connect the two systems in a back-to-back fashion so that you don't need an ethernet hub or switch. Note that you would have to have had the NIC installed in the system and installed the appropriate NIC driver module during the Debian installation to be able to use the NIC.
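For 10/100-Base-T, a NIC transmits on pins 1 and 2 of the RJ-45 connector and receives on pins 3 and 6. The Python sketch below models a cable as a mapping from the pin on one end to the pin on the other, to show why a cross-over cable works back-to-back and a regular cable does not. It assumes neither NIC swaps the pairs automatically, and the names are made up for this example.

```python
# Pin on one connector -> pin it is wired to on the other connector.
STRAIGHT = {pin: pin for pin in range(1, 9)}
CROSSOVER = {1: 3, 2: 6, 3: 1, 6: 2, 4: 4, 5: 5, 7: 7, 8: 8}

TRANSMIT_PINS = {1, 2}   # pins a NIC sends on (10/100-Base-T)
RECEIVE_PINS = {3, 6}    # pins a NIC listens on

def nic_to_nic_works(cable):
    """True if one NIC's transmit pins arrive on the other NIC's receive pins."""
    return {cable[pin] for pin in TRANSMIT_PINS} == RECEIVE_PINS

print("straight-through:", nic_to_nic_works(STRAIGHT))
print("cross-over:      ", nic_to_nic_works(CROSSOVER))
```

A hub or switch does this swap inside its ports, which is why regular cables work when there is a hub or switch in the middle.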

Networking Two PCs Together

The back-to-back connection between the two systems represents the internal LAN where each system would need a unique IP address. You would then hook your modem up to the Debian system to set up the Internet connection (as outlined on the Modems page). And just as shown in the above diagram, the IP address of the NIC on the Debian system would be the default gateway entry on the Windows system. This is because the Debian system will be acting as the gateway for this two-system network.

This configuration will also work if you want to play around with the servers we set up on the Internet Servers page.


How It All Works

So you've got IP addresses and MAC addresses and subnet masks and default gateways and DNS and ARP and hardware and software. How do all these pieces fit together? Consider the network below.

How a network works

The user of the Windows PC in the Sales department fires up their Web browser so that they can access the company's intranet Web servers. There's a corporate intranet Web server with company-wide information and one right in the user's department, on the same network segment, with information for Sales staff.

Remember that each system has two addresses (logical and physical) and in order for one system to communicate with another, it needs to include both of these addresses when it puts a packet together to send to the destination system.
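
As a rough illustration of how the two addresses travel together, here is a minimal Python sketch. It assumes the usual behavior on an IP network: the packet always carries the destination's logical (IP) address, while the physical (MAC) address is that of the destination itself when it sits on the sender's segment, and that of the default gateway otherwise. All host names, IP addresses, and MAC addresses below are invented for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    iface: ipaddress.IPv4Interface  # logical address plus subnet mask
    mac: str                        # physical address


def build_frame(sender, destination, gateway):
    """Return the address fields the sender would put on an outgoing packet."""
    # Same subnet: deliver directly. Different subnet: hand off to the gateway.
    on_same_segment = destination.iface.ip in sender.iface.network
    next_hop = destination if on_same_segment else gateway
    return {
        "src_ip": str(sender.iface.ip),
        "dst_ip": str(destination.iface.ip),  # always the final destination
        "src_mac": sender.mac,
        "dst_mac": next_hop.mac,              # the next system on the wire
    }


# Hypothetical systems (all values are assumptions for illustration).
sales_pc = Host("sales-pc", ipaddress.ip_interface("172.16.1.20/24"), "02:00:00:00:01:20")
sales_web = Host("sales-web", ipaddress.ip_interface("172.16.1.10/24"), "02:00:00:00:01:10")
corp_web = Host("corp-web", ipaddress.ip_interface("172.16.9.10/24"), "02:00:00:00:09:10")
router = Host("router", ipaddress.ip_interface("172.16.1.1/24"), "02:00:00:00:01:01")

if __name__ == "__main__":
    print(build_frame(sales_pc, sales_web, router))
    print(build_frame(sales_pc, corp_web, router))
```

In the first call the departmental server is on the sender's segment, so both destination fields point at that server. In the second call the destination IP is still the corporate server's, but the destination MAC is the router's.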

Friday, September 28, 2007

GNU/Linux FAQ by Richard Stallman

When people see that we use and recommend the name GNU/Linux for a system that many others call just “Linux”, they ask many questions. Here are common questions, and our answers.

Why do you call it GNU/Linux and not Linux?
Most operating system distributions based on Linux as kernel are basically modified versions of the GNU operating system. We began developing GNU in 1984, years before Linus Torvalds started to write his kernel. Our goal was to develop a complete free operating system. Of course, we did not develop all the parts ourselves—but we led the way. We developed most of the central components, forming the largest single contribution to the whole system. The basic vision was ours too.

In fairness, we ought to get at least equal mention.

See Linux and the GNU Project and GNU Users Who Have Never Heard of GNU for more explanation, and The GNU Project for the history.

Why is the name important?
Although the developers of Linux, the kernel, are contributing to the free software community, many of them do not care about freedom. People who think the whole system is Linux tend to get confused and assign to those developers a role in the history of our community which they did not actually play. Then they give inordinate weight to those developers' views.

Calling the system GNU/Linux recognizes the role that our idealism played in building our community, and helps the public recognize the practical importance of these ideals.

How did it come about that most people call the system “Linux”?
Calling the system “Linux” is a confusion that has spread faster than the corrective information.

The people who combined Linux with the GNU system were not aware that that's what their activity amounted to. They focused their attention on the piece that was Linux and did not realize that more of the combination was GNU. They started calling it “Linux” even though that name did not fit what they had. It took a few years for us to realize what a problem this was and ask people to correct the practice. By that time, the confusion had a big head start.

Most of the people who call the system “Linux” have never heard why that's not the right thing. They saw others using that name and assume it must be right. The name “Linux” also spreads a false picture of the system's origin, because people tend to suppose that the system's history was such as to fit that name. For instance, they often believe its development was started by Linus Torvalds in 1991. This false picture tends to reinforce the idea that the system should be called “Linux”.

Many of the questions in this file represent people's attempts to justify the name they are accustomed to using.

Should we always say “GNU/Linux” instead of “Linux”?
Not always—only when you're talking about the whole system. When you're referring specifically to the kernel, you should call it “Linux”, the name its developer chose.

When people call the whole system “Linux”, as a consequence they call the whole system by the same name as the kernel. This causes many kinds of confusion, because only experts can tell whether a statement is about the kernel or the whole system. By calling the whole system “GNU/Linux”, and calling the kernel “Linux”, you avoid the ambiguity.

Would Linux have achieved the same success if there had been no GNU?
In that alternative world, there would be nothing today like the GNU/Linux system, and probably no free operating system at all. No one attempted to develop a free operating system in the 1980s except the GNU Project and (later) Berkeley CSRG, which had been specifically asked by the GNU Project to start freeing its code.

Linus Torvalds was partly influenced by a speech about GNU in Finland in 1990. It's possible that even without this influence he might have written a Unix-like kernel, but it probably would not have been free software. Linux became free in 1992 when Linus rereleased it under the GNU GPL. (See the release notes for version 0.12.)

Even if Torvalds had released Linux under some other free software license, a free kernel alone would not have made much difference to the world. The significance of Linux came from fitting into a larger framework, a complete free operating system: GNU/Linux.

Wouldn't it be better for the community if you did not divide people with this request?
When we ask people to say “GNU/Linux”, we are not dividing people. We are asking them to give the GNU Project credit for the GNU operating system. This does not criticize anyone or push anyone away.

However, there are people who do not like our saying this. Sometimes those people push us away in response. On occasion they are so rude that one wonders if they are intentionally trying to intimidate us into silence. It doesn't silence us, but it does tend to divide the community, so we hope you can convince them to stop.

However, this is only a secondary cause of division in our community. The largest division in the community is between people who appreciate free software as a social and ethical issue and consider proprietary software a social problem (supporters of the free software movement), and those who cite only practical benefits and present free software only as an efficient development model (the open source movement).

This disagreement is not just a matter of names—it is a matter of differing basic values. It is essential for the community to see and think about this disagreement. The names “free software” and “open source” are the banners of the two positions. See Why Free Software Is Better Than Open Source.

The disagreement over values partially aligns with the amount of attention people pay to the GNU Project's role in our community. People who value freedom are more likely to call the system “GNU/Linux”, and people who learn that the system is “GNU/Linux” are more likely to pay attention to our philosophical arguments for freedom and community (which is why the choice of name for the system makes a real difference for society). However, the disagreement would probably exist even if everyone knew the system's real origin and its proper name, because the issue is a real one. It can only go away if we who value freedom either persuade everyone (which won't be easy) or are defeated entirely (let's hope not).

Doesn't the GNU project support an individual's free speech rights to call the system by any name that individual chooses?
Yes, indeed, we believe you have a free speech right to call the operating system by any name you wish. We ask that people call it GNU/Linux as a matter of doing justice to the GNU project, to promote the values of freedom that GNU stands for, and to inform others that those values of freedom brought the system into existence.

Since everyone knows GNU's role in developing the system, doesn't the “GNU/” in the name go without saying?
Experience shows that the system's users, and the computer-using public in general, often know nothing about the GNU system. Most articles about the system do not mention the name “GNU”, or the ideals that GNU stands for. GNU Users Who Have Never Heard of GNU explains further.

The people who say this are probably geeks thinking of the geeks they know. Geeks often do know about GNU, but many have a completely wrong idea of what GNU is. For instance, many think it is a collection of “tools”, or a project to develop tools.

The wording of this question, which is typical, illustrates another common misconception. To speak of “GNU's role” in developing something assumes that GNU is a group of people. GNU is an operating system. It would make sense to talk about the GNU Project's role in this or some other activity, but not that of GNU.

Isn't shortening “GNU/Linux” to “Linux” just like shortening “Microsoft Windows” to “Windows”?
It's useful to shorten a frequently-used name, but not if the abbreviation is misleading.

Most everyone in developed countries really does know that the “Windows” system is made by Microsoft, so shortening “Microsoft Windows” to “Windows” does not mislead anyone as to that system's nature and origin. Shortening “GNU/Linux” to “Linux” does give the wrong idea of where the system comes from.

The question is itself misleading because GNU and Microsoft are not the same kind of thing. Microsoft is a company; GNU is an operating system.

Isn't GNU a collection of programming tools that were included in Linux?
People who think that Linux is an entire operating system, if they hear about GNU at all, often get a wrong idea of what GNU is. They may think that GNU is the name of a collection of programs—often they say “programming tools”, since some of our programming tools became popular on their own. The idea that “GNU” is the name of an operating system is hard to fit into a conceptual framework in which that operating system is labeled “Linux”.

The GNU Project was named after the GNU operating system—it's the project to develop the GNU system. (See the 1983 initial announcement.)

We developed programs such as GCC, GNU Emacs, GAS, GLIBC, BASH, etc., because we needed them for the GNU operating system. GCC, the GNU Compiler Collection, is the compiler that we wrote for the GNU operating system. We, the many people working on the GNU Project, developed Ghostscript, GnuCash, GNU Chess and GNOME for the GNU system too.

What is the difference between an operating system and a kernel?
An operating system, as we use the term, means a collection of programs that are sufficient to use the computer to do a wide variety of jobs. A general purpose operating system, to be complete, ought to handle all the jobs that many users may want to do.

The kernel is one of the programs in an operating system—the program that allocates the machine's resources to the other programs that are running. The kernel also takes care of starting and stopping other programs.

To confuse matters, some people use the term “operating system” to mean “kernel”. Both uses of the term go back many years. The use of “operating system” to mean “kernel” is found in a number of textbooks on system design, going back to the 80s. At the same time, in the 80s, the “Unix operating system” was understood to include all the system programs, and Berkeley's version of Unix included even games. Since we intended GNU to be a Unix-like operating system, we use the term “operating system” in the same way.

Most of the time when people speak of the “Linux operating system” they are using “operating system” in the same sense we use: they mean the whole collection of programs. If that's what you are referring to, please call it “GNU/Linux”. If you mean just the kernel, then “Linux” is the right name for it, but please say “kernel” also to avoid ambiguity about which body of software you mean.

If you prefer to use some other term such as “system distribution” for the entire collection of programs, instead of “operating system”, that's fine. Then you would talk about GNU/Linux system distributions.

We're calling the whole system after the kernel, Linux. Isn't it normal to name an operating system after a kernel?
That practice seems to be very rare—we can't find any examples other than the misuse of the name “Linux”. Normally an operating system is developed as a single unified project, and the developers choose a name for the system as a whole. The kernel usually does not have a name of its own—instead, people say “the kernel of such-and-such” or “the such-and-such kernel”.

Because those two constructions are used synonymously, the expression “the Linux kernel” can easily be misunderstood as meaning “the kernel of Linux” and implying that Linux must be more than a kernel. You can avoid the possibility of this misunderstanding by saying or writing “the kernel, Linux” or “Linux, the kernel.”

The problem with “GNU/Linux” is that it is too long. How about recommending a shorter name?
For a while we tried the name “LiGNUx”, which combines the words “GNU” and “Linux”. The reaction was very bad. People accept “GNU/Linux” much better.

The shortest legitimate name for this system is “GNU”, but we call it “GNU/Linux” for the reasons given below.

Since Linux is a secondary contribution, would it be false to the facts to call the system simply “GNU”?
It would not be false to the facts, but it is not the best thing to do. Here are the reasons we call that system version “GNU/Linux” rather than just “GNU”:
  • It's not exactly GNU—it has a different kernel (that is, Linux). Distinguishing GNU/Linux from GNU is useful.
  • It would be ungentlemanly to ask people to stop giving any credit to Linus Torvalds. He did write an important component of the system. We want to get credit for launching and sustaining the system's development, but this doesn't mean we should treat Linus the same way those who call the system “Linux” treat us. We strongly disagree with his political views, but we deal with that disagreement honorably and openly, rather than by trying to cut him out of the credit for his contribution to the system.
  • Since many people know of the system as “Linux”, if we say “GNU” they may simply not recognize we're talking about the same system. If we say “GNU/Linux”, they can make a connection to what they have heard about.

I would have to pay a fee if I use “Linux” in the name of a product, and that would also apply if I say “GNU/Linux”. Is it wrong if I use “GNU” without “Linux”, to save the fee?
There's nothing wrong in calling the system “GNU”; basically, that's what it is. It is nice to give Linus Torvalds a share of the credit as well, but you have no obligation to pay for the privilege of doing so.

So if you want to refer to the system simply as “GNU”, to avoid paying the fee for calling it “Linux”, we won't criticize you.

Many other projects contributed to the system as it is today; it includes TeX, X11, Apache, Perl, and many more programs. Don't your arguments imply we have to give them credit too? (But that would lead to a name so long it is absurd.)
What we say is that you ought to give the system's principal developer a share of the credit. The principal developer is the GNU Project, and the system is basically GNU.

If you feel even more strongly about giving credit where it is due, you might feel that some secondary contributors also deserve credit in the system's name. If so, far be it from us to argue against it. If you feel that X11 deserves credit in the system's name, and you want to call the system GNU/X11/Linux, please do. If you feel that Perl simply cries out for mention, and you want to write GNU/Linux/Perl, go ahead.

Since a long name such as GNU/X11/Apache/Linux/TeX/Perl/Python/FreeCiv becomes absurd, at some point you will have to set a threshold and omit the names of the many other secondary contributions. There is no one obvious right place to set the threshold, so wherever you set it, we won't argue against it.

Different threshold levels would lead to different choices of name for the system. But one name that cannot result from concerns of fairness and giving credit, not for any possible threshold level, is “Linux”. It can't be fair to give all the credit to one secondary contribution (Linux) while omitting the principal contribution (GNU).

Many other projects contributed to the system as it is today, but they don't insist on calling it XYZ/Linux. Why should we treat GNU specially?
Thousands of projects have developed programs commonly included in today's GNU/Linux systems. They all deserve credit for their contributions, but they aren't the principal developers of the system as a whole, so they don't ask to be credited as such.

GNU is different because it is more than just a contributed program, more than just a collection of contributed programs. GNU is the framework on which the system was made.

Many companies contributed to the system as it is today; doesn't that mean we ought to call it GNU/Redhat/Novell/Linux?

GNU is not comparable to Red Hat or Novell; it is not a company, or an organization, or even an activity. GNU is an operating system. (When we speak of the GNU Project, that refers to the project to develop the GNU system.) The GNU/Linux system is based on GNU, and that's why GNU ought to appear in its name.

Much of those companies' contribution to the GNU/Linux system lies in the code they have contributed to various GNU packages including GCC and GNOME. Saying GNU/Linux gives credit to those companies along with all the rest of the GNU developers.

Why do you write “GNU/Linux” instead of “GNU Linux”?
Following the rules of English, in the construction “GNU Linux” the word “GNU” modifies “Linux”. This can mean either “GNU's version of Linux” or “Linux, which is a GNU package.” Neither of those meanings fits the situation at hand.

Linux is not a GNU package; that is, it wasn't developed under the GNU Project's aegis or contributed specifically to the GNU Project. Linus Torvalds wrote Linux independently, as his own project. So the “Linux, which is a GNU package” meaning is not right.

We're not talking about a distinct GNU version of Linux, the kernel. The GNU Project does not have a separate version of Linux. GNU/Linux systems don't use a different version of Linux; the main use of Linux is in these systems, and the standard version of Linux is developed for them. So “GNU's version of Linux” is not right either.

We're talking about a version of GNU, the operating system, distinguished by having Linux as the kernel. A slash fits the situation because it means “combination.” (Think of “Input/Output”.) This system is the combination of GNU and Linux; hence, “GNU/Linux”.

There are other ways to express “combination”. If you think that a plus-sign is clearer, please use that. In French, a dash is clear: “GNU-Linux”. In Spanish, we sometimes say “GNU con Linux”.

Why “GNU/Linux” rather than “Linux/GNU”?
It is right and proper to mention the principal contribution first. The GNU contribution to the system is not only bigger than Linux and prior to Linux, we actually started the whole activity.

However, if you prefer to call the system “Linux/GNU”, that is a lot better than what people usually do, which is to omit GNU entirely and make it seem that the whole system is Linux.

My distro is called “Foobar Linux”; doesn't that show it's really Linux?

It means that the people who make the “Foobar Linux” distro are repeating the common mistake.

My distro's official name is “Foobar Linux”; isn't it wrong to call the distro anything but “Linux”?

If it's allowed for them to change “GNU” to “Foobar Linux”, it's allowed for you to change it back and call the distro “Foobar GNU/Linux”. It can't be more wrong to correct the mistake than it was to make the mistake.

Wouldn't it be more effective to ask companies such as Mandrake, Red Hat and IBM to call their distributions “GNU/Linux” rather than asking individuals?
It isn't a choice of one or the other—we ask companies and organizations and individuals to help spread the word. In fact, we have asked all three of those companies. Mandrake uses the term “GNU/Linux” some of the time, but IBM and Red Hat were unwilling to help. One executive said, “This is a pure commercial decision; we expect to make more money calling it `Linux'.” In other words, that company did not care what was right.

We can't make them change, but we're not the sort to give up just because the road isn't easy. You may not have as much influence at your disposal as IBM or Red Hat, but you can still help. Together we can change the situation to the point where companies will make more profit calling it “GNU/Linux”.

Wouldn't it be better to reserve the name “GNU/Linux” for distributions that are purely free software? After all, that is the ideal of GNU.
The widespread practice of adding non-free software to the GNU/Linux system is a major problem for our community. It teaches the users that non-free software is ok, and that using it is part of the spirit of “Linux”. Many “Linux” User Groups make it part of their mission to help users use non-free add-ons, and may even invite salesmen to come and make sales pitches for them. They adopt goals such as “helping the users” of GNU/Linux (including helping them use non-free applications and drivers), or making the system more popular even at the cost of freedom.

The question is how to try to change this.

Given that most of the community which uses GNU with Linux already does not realize that's what it is, for us to disown these adulterated versions, saying they are not really GNU, would not teach the users to value freedom more. They would not get the intended message. They would only respond they never thought these systems were GNU in the first place.

The way to lead these users to see a connection with freedom is exactly the opposite: to inform them that all these system versions are versions of GNU, that they all are based on a system that exists specifically for the sake of the users' freedom. With this understanding, they can start to recognize the distributions that include non-free software as perverted, adulterated versions of GNU, instead of thinking they are proper and appropriate “versions of Linux”.

It is very useful to start GNU/Linux User Groups, which call the system GNU/Linux and adopt the ideals of the GNU Project as a basis for their activities. If the Linux User Group in your area has the problems described above, we suggest you either campaign within the group to change its orientation (and name) or start a new group. The people who focus on the more superficial goals have a right to their views, but don't let them drag you along!

Why not make a GNU distribution of Linux (sic) and call that GNU/Linux?
All the “Linux” distributions are actually versions of the GNU system with Linux as the kernel. The purpose of the term “GNU/Linux” is to communicate this point. To develop one new distribution and call that alone “GNU/Linux” would obscure the point we want to make.

As for developing a distribution of GNU/Linux, we already did this once, when we funded the early development of Debian GNU/Linux. To do it again now does not seem useful; it would be a lot of work, and unless the new distribution had substantial practical advantages over other distributions, it would serve no purpose.

Instead we help the developers of 100% free GNU/Linux distributions, such as gNewSense and Ututo.

Why not just say “Linux is the GNU kernel” and release some existing version of GNU/Linux under the name “GNU”?
It might have been a good idea to adopt Linux as the GNU kernel back in 1992. If we had realized, then, how long it would take to get the GNU Hurd to work, we might have done that. (Alas, that is hindsight.)

If we were to take an existing version of GNU/Linux and relabel it as “GNU”, that would be somewhat like making a version of the GNU system and labeling it “Linux”. That wasn't right, and we don't want to act like that.

Did the GNU Project condemn and oppose use of Linux in the early days?
We did not adopt Linux as our kernel, but we didn't condemn or oppose it. In 1993 we started discussing the arrangements to sponsor the development of Debian GNU/Linux. We also sought to cooperate with the people who were changing some GNU packages for use with Linux. We wanted to include their changes in the standard releases so that these GNU packages would work out-of-the-box in combination with Linux. But the changes were often ad-hoc and nonportable; they needed to be cleaned up for installation.

The people who had made the changes showed little interest in cooperating with us. One of them actually told us that he didn't care about working with the GNU Project because he was a “Linux user”. That came as a shock, because the people who ported GNU packages to other systems had generally wanted to work with us to get their changes installed. Yet these people, developing a system that was primarily based on GNU, were the first (and still practically the only) group that was unwilling to work with us.

It was this experience that first showed us that people were calling a version of the GNU system “Linux”, and that this confusion was obstructing our work. Asking you to call the system “GNU/Linux” is our response to that problem, and to the other problems caused by the “Linux” misnomer.

Why did you wait so long before asking people to use the name GNU/Linux?

Actually we didn't. We began talking privately with developers and distributors about this in 1994, and made a more public campaign in 1996. We will continue for as long as it's necessary.

Should the GNU/[name] convention be applied to all programs that are GPL'ed?
We never refer to individual programs as “GNU/[name]”. When a program is a GNU package, we may call it “GNU [name]”.

GNU, the operating system, is made up of many different programs. Some of the programs in GNU were written as part of the GNU Project or specifically contributed to it; these are the GNU packages, and we often use “GNU” in their names.

It's up to the developers of a program to decide if they want to contribute it and make it a GNU package. If you have developed a program and you would like it to be a GNU package, please write to , so we can evaluate it and decide whether we want it.

It wouldn't be fair to put the name GNU on every individual program that is released under the GPL. If you write a program and release it under the GPL, that doesn't mean the GNU Project wrote it or that you wrote it for us. For instance, the kernel, Linux, is released under the GNU GPL, but Linus did not write it as part of the GNU Project—he did the work independently. If something is not a GNU package, the GNU Project can't take credit for it, and putting “GNU” in its name would be improper.

In contrast, we do deserve the overall credit for the GNU operating system as a whole, even though not for each and every program in it. The system exists as a system because of our determination and persistence, starting in 1984, many years before Linux was begun.

The operating system in which Linux became popular was basically the same as the GNU operating system. It was not entirely the same, because it had a different kernel, but it was mostly the same system. It was a variant of GNU. It was the GNU/Linux system.

Linux continues to be used primarily in derivatives of that system—in today's versions of the GNU/Linux system. What gives these systems their identity is GNU and Linux at the center of them, not particularly Linux alone.

Since much of GNU comes from Unix, shouldn't GNU give credit to Unix by using “Unix” in its name?
Actually, none of GNU comes from Unix. Unix was proprietary software (and still is), so using any of its code in GNU would have been illegal. This is not a coincidence; this is why we developed GNU: since you could not have freedom in using Unix, or any of the other operating systems of the day, we needed a free system to replace it. We could not copy programs, or even parts of them, from Unix; everything had to be written afresh.

No code in GNU comes from Unix, but GNU is a Unix-compatible system; therefore, many of the ideas and specifications of GNU do come from Unix. The name “GNU” is a humorous way of paying tribute to Unix, following a hacker tradition of recursive acronyms that started in the 70s.

The first such recursive acronym was TINT, “TINT Is Not TECO”. The author of TINT wrote another implementation of TECO (there were already many of them, for various systems), but instead of calling it by a dull name like “somethingorother TECO”, he thought of a clever amusing name. (That's what hacking means: playful cleverness.)

Other hackers enjoyed that name so much that we imitated the approach. It became a tradition that, when you were writing from scratch a program that was similar to some existing program (let's imagine its name was “Klever”), you could give it a recursive acronym name, such as “MINK” for “MINK Is Not Klever.” In this same spirit we called our replacement for Unix “GNU's Not Unix”.

Historically, AT&T, which developed Unix, did not want anyone to give it credit by using “Unix” in the name of a similar system, not even in a system 99% copied from Unix. AT&T actually threatened to sue anyone giving AT&T credit in that way. This is why each of the various modified versions of Unix (each of them just as proprietary as Unix) had a completely different name that didn't include “Unix”.

Should we say “GNU/BSD” too?
We don't call the BSD systems (FreeBSD, etc.) “GNU/BSD” systems, because that term does not fit the history of the BSD systems.

The BSD system was developed by UC Berkeley as non-free software in the 80s, and became free in the early 90s. A free operating system that exists today is almost certainly either a variant of the GNU system, or a kind of BSD system.

People sometimes ask whether BSD too is a variant of GNU, as GNU/Linux is. It is not. The BSD developers were inspired to make their code free software by the example of the GNU Project, and explicit appeals from GNU activists helped convince them to start, but the code had little overlap with GNU.

BSD systems today use some GNU packages, just as the GNU system and its variants use some BSD programs; however, taken as wholes, they are two different systems that evolved separately. The BSD developers did not write a kernel and add it to the GNU system, so a name like GNU/BSD would not fit the situation.

The connection between GNU/Linux and GNU is much closer, and that's why the name “GNU/Linux” is appropriate for it.

There is a version of GNU which uses the kernel from NetBSD. Its developers call it “Debian GNU/NetBSD”, but “GNU/kernelofNetBSD” would be more accurate, since NetBSD is an entire system, not just the kernel. This is not a BSD system, since most of the system is the same as the GNU/Linux system.

If I install the GNU tools on Windows, does that mean I am running a GNU/Windows system?
Not in the same sense that we mean by “GNU/Linux”. The tools of GNU are just a part of the GNU software, which is just a part of the GNU system, and underneath them you would still have another complete operating system which has no code in common with GNU. All in all, that's a very different situation from GNU/Linux.

Can't there be Linux systems without GNU?
It is possible to make a system that uses Linux as the kernel but is not based on GNU. I'm told there are small systems, used for embedded development, that include Linux but not the GNU system. IBM was once rumored to be planning to put Linux as the kernel into AIX; whether or not they actually tried to do this, it is theoretically possible. What conclusions can we draw from this about the naming of various systems?

People who think of the kernel as more important than all the rest of the system say, “They all contain Linux, so let's call them all Linux systems.” But any two of these systems are mostly different, and calling them by the same name is misleading. (It leads people to think that the kernel is more important than all the rest of the system, for instance.)

In the small embedded systems, Linux may be most of the system; perhaps “Linux systems” is the right name for them. They are very different from GNU/Linux systems, which are more GNU than Linux. The hypothetical IBM system would be different from either of those. The right name for it would be AIX/Linux: basically AIX, but with Linux as the kernel. These different names would show users how these systems are different.

Why not call the system “Linux” anyway, and strengthen Linus Torvalds' role as posterboy for our community?
Linus Torvalds is the “posterboy” (other people's choice of word, not ours) for his goals, not ours. His goal is to make the system more popular, and he believes its value to society lies merely in the practical advantages it offers: its power, reliability and easy availability. He has never advocated freedom to cooperate as an ethical principle, which is why the public does not connect the name “Linux” with that principle.

Linus publicly states his disagreement with the free software movement's ideals. He developed non-free software in his job for many years (and said so to a large audience at a “Linux”World show), and publicly invited fellow developers of Linux, the kernel, to use non-free software to work on it with him. He goes even further, and rebukes people who suggest that engineers and scientists should consider social consequences of our technical work—rejecting the lessons society learned from the development of the atom bomb.

There is nothing wrong with writing a free program for the motivations of learning and having fun; the kernel Linus wrote for those reasons was an important contribution to our community. But those motivations are not the reason why the complete free system, GNU/Linux, exists, and they won't secure our freedom in the future. The public needs to know this. Linus has the right to promote his views; however, people should be aware that the operating system in question stems from ideals of freedom, not from his views.

Does Linus Torvalds agree that Linux is just the kernel?
He recognized this at the beginning. The earliest Linux release notes said, “Most of the tools used with linux are GNU software and are under the GNU copyleft. These tools aren't in the distribution - ask me (or GNU) for more info”.

The battle is already lost—society has made its decision and we can't change it, so why even think about it?
This isn't a battle, it is a campaign of education. What to call the system is not a single decision, to be made at one moment by “society”: each person, each organization, can decide what name to use. You can't make others say “GNU/Linux”, but you can decide to call the system “GNU/Linux” yourself—and by doing so, you will make a difference.

Society has made its decision and we can't change it, so what good does it do if I say “GNU/Linux”?
This is not an all-or-nothing situation: various people spread correct and incorrect pictures of the system to varying degrees. If you call the system “GNU/Linux”, you will help others learn the system's true history, origin, and reason for being. You can't correct the misnomer everywhere on your own, any more than we can, but you can help. If only a few hundred people see you use the term “GNU/Linux”, you will have educated a substantial number of people with very little work. And some of them will spread the correction to others.

Wouldn't it be better to call the system “Linux” and teach people its real origin with a ten-minute explanation?
If you help us by explaining to others in that way, we appreciate your effort, but that is not the best method. It is not as effective as calling the system “GNU/Linux”, and uses your time inefficiently.

It is ineffective because it may not sink in, and surely will not propagate. Some of the people who hear your explanation will pay attention, and they may learn a correct picture of the system's origin. But they are unlikely to repeat the explanation to others whenever they talk about the system. They will probably just call it “Linux”. Without particularly intending to, they will help spread the incorrect picture.

It is inefficient because it takes a lot more time. Saying and writing “GNU/Linux” will take you only a few seconds a day, not minutes, so you can afford to reach far more people that way. Distinguishing between Linux and GNU/Linux when you write and speak is by far the easiest way to help the GNU Project effectively.

Some people laugh at you when you ask them to call the system GNU/Linux. Why do you subject yourself to this treatment?
Calling the system “Linux” tends to give people a mistaken picture of the system's history and reason for existence. People who laugh at our request probably have picked up that mistaken picture—they think our work was done by Linus, so they laugh when we ask for credit for it. If they knew the truth, they probably wouldn't laugh.

Why do we take the risk of making a request that sometimes leads people to ridicule us? Because often it has useful results that help the GNU Project. We will run the risk of undeserved abuse to achieve our goals.

If you see such an ironically unfair situation occurring, please don't sit idly by. Please teach the laughing people the real history. When they see why the request is justified, those who have any sense will stop laughing.

Some people condemn you when you ask them to call the system GNU/Linux. Don't you lose by alienating them?
Not much. People who don't appreciate our role in developing the system are unlikely to make substantial efforts to help us. If they do work that advances our goals, such as releasing free software, it is probably for other unrelated reasons, not because we asked them. Meanwhile, by teaching others to attribute our work to someone else, they are undermining our ability to recruit the help of others.

It makes no sense to worry about alienating people who are already mostly uncooperative, and it is self-defeating to be deterred from correcting a major problem lest we anger the people who perpetuate it. Therefore, we will continue trying to correct the misnomer.

Whatever you contributed, is it legitimate to rename the operating system?
We are not renaming anything; we have been calling this system “GNU” ever since we announced it in 1983. The people who tried to rename it to “Linux” should not have done so.

Isn't it wrong to force people to call the system “GNU/Linux”?
It would be wrong to force them, and we don't try. We call the system “GNU/Linux”, and we ask you to do it too.

Why not sue people who call the whole system “Linux”?
There are no legal grounds to sue them, but since we believe in freedom of speech, we wouldn't want to do that anyway. We ask people to call the system “GNU/Linux” because that is the right thing to do.

Shouldn't you put something in the GNU GPL to require people to call the system “GNU”?
The purpose of the GNU GPL is to protect the users' freedom from those who would make proprietary versions of free software. While it is true that those who call the system “Linux” often do things that limit the users' freedom, such as bundling non-free software with the GNU/Linux system or even developing non-free software for such use, the mere act of calling the system “Linux” does not, in itself, deny users their freedom. It seems improper to make the GPL restrict what name people can use for the system.

Since you objected to the original BSD license's advertising requirement to give credit to the University of California, isn't it hypocritical to demand credit for the GNU Project?
It would be hypocritical to make the name GNU/Linux a license requirement, and we don't. We only ask you to give us the credit we deserve.

Since you failed to put something in the GNU GPL to require people to call the system “GNU”, you deserve what happened; why are you complaining now?
The question presupposes a rather controversial general ethical premise: that if people do not force you to treat them fairly, you are entitled to take advantage of them as much as you like. In other words, it assumes that might makes right.

We hope you disagree with that premise just as we do.

Wouldn't you be better off not contradicting what so many people believe?
We don't think we should go along with large numbers of people because they have been misled. We hope you too will decide that truth is important.

We could never have developed a free operating system without first denying the belief, held by most people, that proprietary software was legitimate and acceptable.

Since many people call it “Linux”, doesn't that make it right?
We don't think that the popularity of an error makes it the truth.

Many people care about what's convenient or who's winning, not about arguments of right or wrong. Couldn't you get more of their support by a different road?
To care only about what's convenient or who's winning is an amoral approach to life. Non-free software is an example of that amoral approach and thrives on it. So in the long run it is self-defeating for us to bow to that approach. We will continue talking in terms of right and wrong.

We hope that you are one of those for whom right and wrong do matter.