Project 1: Socket Basics

Get started writing code using network sockets

Description

This assignment is intended to familiarize you with writing simple network code. You will implement a client program that communicates with a server using sockets. The server will ask your program to do some basic string manipulation and counting. If your program successfully counts all of the strings, then the server will return a secret flag that is unique for each student. If you receive the secret flag, then you know that your program has run successfully.

Your client must support TLS encrypted sockets. The server will return different secret flags depending on whether your client communicates with or without TLS. To receive full credit on this project, you must turn in both secret flags: the one retrieved via a non-encrypted socket, and the one retrieved via a TLS encrypted socket.

Language

You can write your code in whatever language you choose, as long as it compiles and runs from the command line on unmodified Khoury College Linux machines. Do not use libraries that are not installed by default on those machines. You may use IDEs (e.g., Eclipse) during development, but do not turn in your IDE project without a Makefile. Make sure your code has no dependencies on your IDE.

Protocol

The server runs on the machine proj1.3700.network and listens for non-encrypted requests on a TCP socket bound to port 27993. This exercise has four types of messages: HELLO, FIND, COUNT, and BYE. Each message is an ASCII string consisting of multiple fields separated by spaces (0x20) and terminated with a line feed (0x0A, \n). The maximum length of each message is 8192 bytes. Messages are case sensitive.

The protocol works as follows. The client initiates the protocol by creating a TCP socket connection to the server. Once the socket is connected, the client sends a HELLO message to the server. The format of the HELLO message is:

ex_string HELLO [your NEU ID]\n

In your program you should replace [your NEU ID] with your NEU ID (including any leading zeroes). You must supply your NEU ID so the server can look up the appropriate secret flag for you. The server will reply with a FIND message. The format of the FIND message is:

ex_string FIND [A single ASCII symbol] [A string of random characters]\n

The two variable fields contain (1) a single ASCII symbol, such as “A”, “f”, “4”, or “%”, without quotes, and (2) a string of random characters. Your program must count the number of times the given ASCII symbol appears in the random string and return this count to the server in a COUNT message. The COUNT message has the following format:

ex_string COUNT [the count of the given symbol in the given string]\n

It is okay for the count to be zero.
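As a rough illustration of the FIND/COUNT step described above, the following Python sketch turns one complete FIND line into the matching COUNT reply. The helper name count_reply is hypothetical, and the sketch assumes that the random string contains no spaces, which the field-separator rule implies but the assignment does not state outright.

```python
def count_reply(line: str) -> str:
    """Build the COUNT reply for one complete FIND line (hypothetical helper)."""
    if not line.endswith("\n"):
        raise ValueError("message is not newline-terminated")
    fields = line[:-1].split(" ")
    # Assumption: exactly four space-separated fields, i.e. the random
    # string itself contains no spaces.
    if len(fields) != 4 or fields[:2] != ["ex_string", "FIND"]:
        raise ValueError("not a well-formed FIND message")
    symbol, text = fields[2], fields[3]
    if len(symbol) != 1:
        raise ValueError("symbol field must be a single character")
    # str.count returns 0 when the symbol is absent, which is a valid answer.
    return f"ex_string COUNT {text.count(symbol)}\n"
```

The comparison is case sensitive, matching the protocol: counting "a" in a string does not count "A".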

The server will respond to the COUNT message with either another FIND message, or a BYE message. If the server terminates the connection, that means your count was incorrect. If the server sends another FIND message, your program must count the occurrences of the new given symbol and return another COUNT message. The server will ask your program to count hundreds of strings; the exact number of strings is chosen at random. Eventually, the server will return a BYE message.

The BYE message has the following format:

ex_string BYE [a 64 byte secret flag]\n

Once your program has received the BYE message, it can close the connection to the server. If the server returns “Unknown_Husky_ID” in the BYE message, that means it did not recognize the NEU ID that you supplied in the HELLO message. Otherwise, the 64-byte string is your secret flag: write this value down, since you need to turn it in along with your code.
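Putting the four message types together, the whole exchange might be sketched in Python as shown here. This is only one possible structure, not a reference solution. The name run_session is hypothetical, and rfile and wfile are assumed to be binary file objects such as those returned by socket.makefile("rb") and socket.makefile("wb").

```python
def run_session(rfile, wfile, neu_id: str) -> str:
    """Run HELLO/FIND/COUNT/BYE over binary file objects and return the flag."""
    wfile.write(f"ex_string HELLO {neu_id}\n".encode("ascii"))
    wfile.flush()
    while True:
        # readline() keeps reading until it sees "\n", EOF, or the size cap.
        raw = rfile.readline(8192)
        if not raw.endswith(b"\n"):
            raise ConnectionError("connection closed or message too long")
        fields = raw[:-1].decode("ascii").split(" ")
        if fields[:2] == ["ex_string", "FIND"] and len(fields) == 4 and len(fields[2]) == 1:
            count = fields[3].count(fields[2])
            wfile.write(f"ex_string COUNT {count}\n".encode("ascii"))
            wfile.flush()
        elif fields[:2] == ["ex_string", "BYE"] and len(fields) == 3:
            if fields[2] == "Unknown_Husky_ID":
                raise ValueError("server did not recognize the NEU ID")
            return fields[2]
        else:
            raise ValueError("unexpected message from server")
```

Passing the NEU ID around as a string, never as an integer, keeps any leading zeroes intact in the HELLO message.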

Your client program

Your client program must execute on the command line using the following command.

$ ./client <-p port> <-s> [hostname] [NEU ID]

Your program must follow this command line syntax exactly, i.e., your program must be called client and it must accept these two optional and two required parameters in exactly this order. If you cannot name your program client (e.g., your program is in Java and you can only generate client.class) then you must include a script called client in your submission that accepts these parameters and then executes your actual program. Keep in mind that all of your submissions will be evaluated by grading scripts; if your program does not conform exactly to the specification then the grading scripts may fail, which will result in loss of points.

  • The -p port parameter is optional; it specifies the TCP port that the server is listening on. If this parameter is not supplied on the command line, your program must assume that the port is 27993.
  • The -s flag is optional; if given, the client should use a TLS encrypted socket connection. If this parameter is supplied on the command line and -p is not specified, your program must assume that the port is 27994.
  • The [hostname] parameter is required, and specifies the name of the server (either a DNS name or an IP address in dotted notation).
  • The [NEU ID] parameter is required. Your code must support NEU IDs that have leading zeroes (do not strip them!).

Note that when we say a parameter is “optional”, that means it may or may not be given on the command line when a person invokes your program. It does not mean that you may choose to not implement support for the parameter, i.e., implementing this functionality is not optional.
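For the command line rules above, a minimal Python sketch using the standard argparse module could look like the following. The function name parse_args and the attribute names port, use_tls, hostname, and neu_id are illustrative choices, not part of the assignment.

```python
import argparse

def parse_args(argv=None):
    """Parse: client <-p port> <-s> [hostname] [NEU ID] (illustrative sketch)."""
    parser = argparse.ArgumentParser(prog="client")
    parser.add_argument("-p", dest="port", type=int, default=None)
    parser.add_argument("-s", dest="use_tls", action="store_true")
    parser.add_argument("hostname")
    # Kept as a string so that leading zeroes are never stripped.
    parser.add_argument("neu_id")
    args = parser.parse_args(argv)
    if args.port is None:
        # Default port depends on whether TLS was requested.
        args.port = 27994 if args.use_tls else 27993
    return args
```

For example, parse_args(["-s", "proj1.3700.network", "001234567"]) selects TLS on port 27994, while an explicit -p value always takes precedence over either default.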

Your program should print exactly one line of output: the secret flag from the server’s BYE message. If your program encounters an error, it may print an error message before terminating. Your program should not write any files to disk, especially to the secret_flags file!

Encrypted Communication with TLS

In addition to supporting the unencrypted version of the protocol specified above, your client program must also support an encrypted version of the protocol. To accomplish this, you must modify your client such that it supports TLS connections. If the -s parameter is given to your program, it should connect to the server using an encrypted TLS socket and then complete the protocol normally (i.e., HELLO, FIND, COUNT, and BYE). You may assume that the server’s TLS port is 27994, unless the port is overridden on the command line using the -p option.

All modern programming languages have support for TLS encrypted sockets. You may use libraries, modules, etc. to facilitate adding this functionality to your client program.
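In Python, for instance, the standard ssl module can wrap an ordinary TCP socket. The sketch below is one way this might be done; the helper names make_tls_context and open_connection are hypothetical. Because the FAQ notes that the course server's certificate is self-signed, the verify=False path is likely the one needed here, but disabling validation is unsafe outside an exercise like this one.

```python
import socket
import ssl

def make_tls_context(verify: bool = True) -> ssl.SSLContext:
    """Create a client-side TLS context, optionally without certificate checks."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context

def open_connection(hostname: str, port: int, use_tls: bool, verify: bool = True):
    """Connect over TCP and, if requested, upgrade the socket to TLS."""
    sock = socket.create_connection((hostname, port))
    if use_tls:
        context = make_tls_context(verify)
        sock = context.wrap_socket(sock, server_hostname=hostname)
    return sock
```

The wrapped socket exposes the same send and receive calls as a plain socket, so the rest of the protocol code does not need to know whether TLS is in use.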

When you successfully run your TLS-enabled client against the TLS version of the server (using port 27994), you will receive a new secret flag (that is different from the normal secret flag). You must add this TLS secret flag to the secret_flags file when you turn in your project (i.e., your secret_flags file will eventually contain two flags).

Other Considerations

You may test your client code with our server as many times as you like. Your client should conform to the protocol described above, otherwise the server will terminate the connection silently. Your client program must verify the validity of messages by strictly checking their format, i.e., the server may send corrupted messages just to try and crash your software. If a received message is not as expected, such as an incorrect field or wrong message type, you must assert an error and terminate your program. You should be strict; if the returned message does not exactly conform to the specification above, you should assert an error. Remember that network-facing code should be written defensively.
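One way to be strict about message format is to match each received message against an anchored regular expression and reject anything else. The Python sketch below does this for the two server messages. The name classify is hypothetical, and the character class is an assumption: it treats the variable fields as printable, non-space ASCII (0x21 to 0x7E), which the assignment does not spell out.

```python
import re

MAX_MESSAGE_BYTES = 8192  # maximum message length from the protocol description

# Assumption: variable fields contain only printable, non-space ASCII.
FIND_RE = re.compile(rb"ex_string FIND ([!-~]) ([!-~]*)\n")
BYE_RE = re.compile(rb"ex_string BYE ([!-~]{64}|Unknown_Husky_ID)\n")

def classify(raw: bytes):
    """Return ("FIND", symbol, text) or ("BYE", flag); raise ValueError otherwise."""
    if len(raw) > MAX_MESSAGE_BYTES:
        raise ValueError("message exceeds 8192 bytes")
    # fullmatch requires the whole message to fit the pattern, newline included.
    match = FIND_RE.fullmatch(raw)
    if match:
        return ("FIND", match.group(1).decode("ascii"), match.group(2).decode("ascii"))
    match = BYE_RE.fullmatch(raw)
    if match:
        return ("BYE", match.group(1).decode("ascii"))
    raise ValueError(f"malformed message: {raw[:40]!r}")
```

Matching is case sensitive by default, so a message such as "ex_string find a abc" is rejected rather than quietly accepted.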

Submitting Your Project

To turn in your project, you should submit the following four things:

  1. Your thoroughly documented source code that implements your client program.
  2. A Makefile that compiles your code. You must turn in a Makefile, even if your code does not need to be compiled (e.g., your code is in Python or Ruby). You may leave the Makefile blank in this case.
  3. A plain-text (no Word or PDF) README.md file. In this file, you should briefly describe your high-level approach, any challenges you faced, and an overview of how you tested your code.
  4. A file called secret_flags. This file should contain both of your secret flags, one per line, in plain ASCII.

Your README.md, Makefile, secret_flags file, source code, etc. should all be placed in the root of a compressed archive (e.g., a .zip or .tar.gz) and then uploaded to Gradescope. Alternatively, you can check these items in to Github and then instruct Gradescope to clone your Github repository.

Double Checking Your Submission

To try and make sure that your submission is (1) complete and (2) will work with our grading scripts, we provide a simple script that checks the formatting of your submission. You can download the script here and invoke it using the following command:

$ ./socketbasics_fmt_chk.py [path to your project directory]

Note that you may need to chmod +x socketbasics_fmt_chk.py to make the script executable.

This script will attempt to make sure that the correct files (e.g., README.md, secret_flags, and Makefile) are available in the given directory, that your secret_flags file contains at least two 64-byte flags, that your Makefile will run without errors (or is empty), and that after running the Makefile a program named client exists in the directory. The script will also try to determine if your files use Windows-style line endings (\r\n) as opposed to Unix-style line endings (\n). If your files use Windows-style line endings, you should convert them to Unix-style line endings using the dos2unix utility before turning them in.

Grading

This project is worth 5% of your final grade. If your program compiles and runs correctly, and you successfully submit both secret flags, then you will receive full credit. All student code will be scanned by plagiarism detection software to ensure that students are not copying code from the Internet or each other.

FAQ

Here are a few common questions that get asked about this project:

  • Do the [Makefile, README.md, secret_flags, client] files need to be named exactly that? Can I turn in [README.txt, client.py, secret_flags.whatever, etc.] instead? NO. The files need to be named exactly what we have specified in this document. If you don’t follow the specification exactly, you will lose points.
  • Why do we need a Makefile? This is the consequence of letting you write your program in whatever language you want. Since you can turn in whatever crazy source code you want, we need to set a couple of ground rules so that we can compile and run your code. Those ground rules are (1) everyone turns in a Makefile, and (2) everyone’s program must be named client.
  • I’m using Java, and I can’t get TLS to work. My program is complaining about invalid, self-signed certificates. Good on you for using a language that bothers to check whether the server’s certificate is valid. As it turns out, my server’s certificate is self-signed, which means it is not technically valid and should not be trusted. This will make a lot more sense later in the semester when we talk about PKI/TLS. In the meantime, you need to disable certificate validation in your code in order to ignore this error.
  • I did socket.read(), counted the characters, and sent the count to the server, but then the server closed the connection. I manually checked and my count was correct. What’s happening? Your count wasn’t correct; the question is why was your count incorrect? The problem is that you did not read the entire message from the server, i.e., you did not receive the entire random string. Did you check to make sure the message from the server ended with a newline ("\n")? Just because you call socket.read(), does not necessarily mean you will receive the entire message from the server. You may need to call socket.read() multiple times to receive the entire message.
  • Sometimes when I socket.read(), I just receive random ASCII characters. Why isn’t the server sending me valid FIND or BYE messages? This is the same problem as the previous question. You are probably not reading from the socket until you receive a newline ("\n"). In other words, you socket.read() the first portion of the server’s message, count characters in an incomplete random string, send a COUNT message to the server, and then socket.read() additional characters from the random string, but you are misinterpreting the second socket.read() as the start of a new message from the server.
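The partial-read problem described in the last two answers can be avoided by buffering received bytes until a newline arrives. The following Python sketch shows one way to do that with socket.recv(). The helper name recv_line and the 4096-byte chunk size are arbitrary choices made for illustration.

```python
import socket

def recv_line(sock: socket.socket, buffer: bytearray, limit: int = 8192) -> bytes:
    """Return one complete message, including its trailing newline.

    `buffer` carries leftover bytes between calls, because a single recv()
    may return only part of a message or the start of the next one.
    """
    while b"\n" not in buffer:
        if len(buffer) >= limit:
            raise ValueError("message exceeds the maximum length")
        chunk = sock.recv(4096)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError("server closed the connection mid-message")
        buffer.extend(chunk)
    end = buffer.index(b"\n") + 1
    if end > limit:
        raise ValueError("message exceeds the maximum length")
    line = bytes(buffer[:end])
    del buffer[:end]  # keep any bytes that belong to the next message
    return line
```

The caller creates one bytearray per connection and passes the same object to every call, so bytes received past the newline are not lost.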