Ben Bowen's Blog

Answering "What Great .NET Developers Ought To Know" - Everyone Who Writes Code 

12 years ago in 2005, Scott Hanselman published an article on his blog entitled "What Great .NET Developers Ought To Know". For fun, I've decided to answer the first four sections (one per post).

What Everyone who Writes Code Ought to Know 

Describe the difference between a Thread and a Process? 

Answer: Threads run 'inside' processes and all share the process's memory. Processes are 'containers' for threads and represent applications/executables as a whole.
Exploration: This question is actually not as cut-and-dried as it might first seem. The short answer above is probably what an interviewer is looking for, but the reality is more complex. First of all, processes are more than just 'thread containers'. They also 'own' any I/O handles that any thread might create, as well as some global security contexts (though threads often have their own security contexts as well). Furthermore, processes can share memory too; every major OS has a mechanism for this. On Windows it can be done via named memory-mapped files (also called named shared memory). And now that .NET is cross-platform, we have to consider what 'thread' means outside the Windows environment. I'm far from a Linux guru, but as far as I know the Linux kernel doesn't really have a separate concept of a thread, only processes (or 'tasks'). When the user wants to spawn a 'thread' in Linux, they're simply asking the kernel to create a new task that shares the same memory and state as the one that created it. This all gives us something to think about: If processes can share memory, what actually is the difference between a Thread and a Process? Ultimately I think they're both just names we give to similar things, chosen in a way that helps us understand the partitioning of virtual memory.
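
To make the short answer concrete, here is a minimal sketch in Python (used purely for brevity; it is not .NET code). It assumes nothing beyond the standard library. The names `shared` and `worker` are made up for the illustration. Three threads append to one list and report the same process ID, while a separately launched interpreter gets its own PID and starts with none of that state.

```python
import os
import subprocess
import sys
import threading

shared = []  # lives in this process's memory


def worker(tag):
    # Every thread appends to the same list object.
    shared.append((tag, os.getpid()))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All three threads ran inside this process, so they report one PID.
thread_pids = {pid for _, pid in shared}

# A separate process is a fresh interpreter with its own PID and its own
# memory; it knows nothing about `shared`.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True, text=True, check=True,
)
child_pid = int(child.stdout.strip())

print(len(shared), thread_pids == {os.getpid()}, child_pid != os.getpid())
```

Sharing memory between the two processes would need an explicit OS mechanism such as the memory-mapped files mentioned above, which is exactly the point: threads share by default, processes share by request.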

What is a Windows Service and how does its lifecycle differ from a "standard" EXE? 

Answer: Windows services are applications that can only be started and stopped via the Service Control Manager. They generally (but not necessarily) run for the entire time the host machine is turned on.
Exploration: I recently built a Windows service as a 'process monitor' to ensure that my employer's SCADA application was always restarted if it was shut down (damn those devious users ;)). Something that I didn't realise until recently is that since Vista, services have essentially no way of interacting with the desktop. A service cannot have a GUI or console of any kind, and the same applies to any applications started by a service. In fact, services these days run in their own session (Session 0), separate from the sessions of logged-in users, and there's almost no way to interact with standard user-space. Services also must be installed- you cannot simply point the SCM at an executable. In terms of lifecycle, the service EXE cannot be run like a regular application- the Main entry-point only attempts to dispatch further 'ServiceMain' entry points back to the SCM. As the documentation says, an error is returned "if the program is being run as a console application rather than as a service".

What is the maximum amount of memory any single process on Windows can address? 

Answer: On 32-bit desktop OSs, 2GB unless the process is marked as "large address aware", in which case 3GB. On 64-bit desktops, it's at least 8TB. On Windows Servers, the values change according to the license, OS version, and other factors.
Exploration: I'm mostly going to talk about Windows Desktop here as that's what I'm most familiar with: Some of this information may be wrong for server builds. Back in 2005 the answer to this question was easier. Although 64-bit CPUs were available I don't believe they had much market penetration yet, so the answer was simply either 2GB or 3GB. By default processes on 32-bit Windows can access 2GB- that's half the range of a 32-bit word. The other half of the virtual memory space is reserved for Windows OS code/memory- mostly so that your application can call into those functions without having to incur a costly address-space switch. However, with a special boot option and a flag set in the executable header, it is possible for an app on 32-bit Windows to access 3GB of its own memory (Windows squeezes its own data into the remaining 1GB). On 64-bit Windows, applications can access a much larger range of addresses; initially 8TB, and as of Windows 8.1 it's now 128TB (that's a lot of RAM! Still doesn't stop some people from assuming it'll never grow even further though- history repeats itself). ;)
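
The figures above fall out of simple powers of two. A quick sanity check in Python (the variable names are just for this illustration, and the units are binary, so "GB" here means GiB):

```python
GIB = 2**30
TIB = 2**40

full_32bit = 2**32              # every address a 32-bit pointer can hold
default_user = full_32bit // 2  # the default user-mode half
laa_user = 3 * GIB              # with the boot option and header flag

print(full_32bit // GIB, default_user // GIB, laa_user // GIB)  # 4 2 3

# The 64-bit figures quoted above correspond to 43 and 47 address bits.
print(2**43 // TIB, 2**47 // TIB)  # 8 128
```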

What is the difference between an EXE and a DLL? 

Answer: EXEs represent applications that can be run, whereas DLLs represent shared codebases to be used by those applications.
Exploration: There's not too much else to say here. The actual format for both EXE and DLL files is Portable Executable (PE); for .NET applications the entry point calls into the CLR's main function, which then takes over.
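
Since both file types share the PE format, the difference is visible as a single flag in the file header. The sketch below (Python, standard library only) reads the COFF Characteristics field and tests the IMAGE_FILE_DLL bit. `pe_characteristics`, `is_dll` and `fake_image` are hypothetical helper names invented for this example, and `fake_image` builds only a bare header for demonstration, not a loadable file.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020
IMAGE_FILE_DLL = 0x2000


def pe_characteristics(data: bytes) -> int:
    """Return the COFF Characteristics field of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # Offset 0x3C of the DOS header points at the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Characteristics is the last 2-byte field of the 20-byte COFF header.
    (flags,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    return flags


def is_dll(data: bytes) -> bool:
    return bool(pe_characteristics(data) & IMAGE_FILE_DLL)


def fake_image(flags: int) -> bytes:
    # Demonstration-only header: DOS stub, PE signature, COFF header.
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x8664, 0, 0, 0, 0, 0, flags)
    return bytes(dos) + b"PE\0\0" + coff


print(is_dll(fake_image(0x2002)), is_dll(fake_image(0x0022)))  # True False
```

The same field carries the flag set in the executable header that the memory question above refers to (IMAGE_FILE_LARGE_ADDRESS_AWARE). To try this on a real file, pass the bytes of an EXE or DLL to `is_dll` instead of the fake header.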

What is strong-typing versus weak-typing? Which is preferred? Why? 

Answer: Strong typing is where the compiler ensures that every variable can only be assigned a value whose type matches the variable's declared type, or some more derived type (where polymorphism is supported). Weak typing tries to coerce (force) any value into the type of the variable/parameter it's being assigned to. Both approaches are preferred in different scenarios (and somewhat according to programmer preference).
Exploration: Actually, there doesn't seem to be any real set-in-stone definition for strong and weak typing; and there's confusion around static/dynamic as well (and whether or not they all mean the same thing or something different)- but I think anyone who isn't a stickler for pedantry can at least accept that the answer above is generally on the right track. As for which is preferred, personally I would only use weak typing for light scripting work and stuff like simple websites (this blog is an example: I wrote it in PHP). Strong typing really shines when the application is large or requires a certain element of security/integrity (e.g. SCADA/systems/defence etc. which is where I work mostly). One reason why strong typing is superior here is that it eliminates an entire breed of programmer error; but a more subtle (and probably more important) reason is that it tends to enforce more structure in the design of the codebase (when one is familiar with strong typing at least). I find weakly-typed languages are more permissive when it comes to just hacking out a braindump of exactly what you need: Very useful when you have something you just want to get done and that doesn't have a strict requirement for quality.
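
A small illustration of the coercion half of the definition, with the caveat that it also shows how muddy the terminology is. Python has no compile-time type checking (it is dynamically typed), yet it refuses to coerce mismatched operands at runtime, which is why it is usually described as strongly typed. A weakly typed language would quietly coerce instead: in PHP, "1" + 1 gives 2. The `add` function is just a stand-in for any operation.

```python
def add(a, b):
    return a + b


try:
    add("1", 1)          # str + int: no implicit coercion is attempted
    outcome = "coerced"
except TypeError:
    outcome = "rejected"

# The conversion has to be spelled out by the programmer.
explicit = add(int("1"), 1)

print(outcome, explicit)  # rejected 2
```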

What is a PID? How is it useful when troubleshooting a system? 

Answer: A Process ID. Really only useful for picking out a specific process, e.g. when measuring metrics in soak tests or attaching profilers/debuggers.
Exploration: There's not much to say here. If you wanna manipulate processes programmatically in .NET, the Process class (System.Diagnostics.Process) is your friend.
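
For a feel of what manipulating a process by its handle and PID looks like, here is a rough Python sketch using the standard `subprocess` module (the .NET Process class offers comparable operations, but this is not .NET code). The sleeping child is just a placeholder workload.

```python
import os
import subprocess
import sys

# Start a child process that would otherwise sleep for 30 seconds.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
child_pid = child.pid                  # the ID a debugger or profiler would attach to
still_running = child.poll() is None   # poll() returns None while the child is alive

# Stop it early and wait for the OS to report that it has exited.
child.terminate()
child.wait(timeout=10)
exited = child.poll() is not None

print(os.getpid() != child_pid, still_running, exited)
```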

How many processes can listen on a single TCP/IP port? 

Answer: One.
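Exploration: For TCP, by default each port (strictly, each address/port pair) can only be listened to by a single owner, because of the reliable-stream-based nature of TCP. For example, what would happen if one application read a TCP packet in time (before the network hardware buffer dropped it) but another did not? There is an option to forcibly rebind to an existing socket in most socket libraries (SO_REUSEADDR); as far as I know however this is used to allow a lingering socket from a forcibly closed application to be quickly re-appropriated by that application when it is restarted- and not so that multiple applications can 'share' a stream. There are platform-specific exceptions (SO_REUSEPORT on Linux, for example, lets several sockets listen on the same port, with the kernel handing each incoming connection to just one of them), but even then no single stream is ever shared.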
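
The default behaviour is easy to see for yourself. In this Python sketch two sockets in the same script stand in for two applications (an assumption made to keep the example self-contained; the rule applies to the sockets themselves, whichever process owns them). With default socket options, the second bind to an address and port that already has a listener is expected to fail.

```python
import socket

first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
first.listen()
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Same address and port, no SO_REUSEADDR/SO_REUSEPORT set.
    second.bind(("127.0.0.1", port))
    second.listen()
    second_bound = True
except OSError:
    # "Address already in use" on most platforms.
    second_bound = False
finally:
    second.close()
    first.close()

print(second_bound)
```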

What is the GAC? What problem does it solve? 

Answer: The Global Assembly Cache is where shared .NET Framework assemblies are stored. It's usually at C:\Windows\assembly\ (assemblies for .NET 4 and later live under C:\Windows\Microsoft.NET\assembly\ instead). The problem it's aiming to solve is something called DLL Hell.
Exploration: Although it looks like an everyday directory it's actually more internally structured- navigating to it in Windows Explorer will subtly reveal this as the explorer shell's menus disappear. Really only core framework assemblies should go here, although it is possible to install your own assemblies into the GAC. Unless your dependencies are unwieldy to manage or take up a gargantuan amount of space, I prefer simple self-contained deployments. Every assembly in the GAC must be strongly named, but more on that in the next post. ;)