One Step at a Time

There were several features I needed to implement for this project that I hadn’t tackled before:

  • serving actual html, rather than just instruction text or JSON responses
  • distinguishing and using both GET and POST methods
  • and then of course the core “problem” of allowing a user to upload a file and figuring out how to extract data from that file.

I thus broke the problem down into small achievable pieces.

First I focused simply on serving html. I defined a file called “formfile.gtpl” and, after some poking around, settled on putting it in a new directory called static in the project root. The early version I used looked something like this:

<html>
	<head><title>File Metadata Service</title>
	</head>
	<body>
		<form action="/files" method="POST">
			Testing:<input type="text" name="test" />
			<input type="submit" value="Submit" />
		</form>
	</body>
</html>

Initially my handler simply wrote the contents of this file to the response on every request. Done.

The POSTman cometh

Next I needed to actually handle both GET and POST methods. A chapter from a book about building web applications with Go proved useful. A simple if/else checking r.Method was sufficient. For a GET request, I served the html file. For a POST request… I parsed the form, then echoed the value of my “test” field: io.WriteString(w, r.Form["test"][0]).

Nothing happened! Eventually I narrowed this down to a matter of routes: while my default route to this service was /files/, the POST request was trying to request /files. I don’t entirely understand the why here, but routing both requests to the files handler proved sufficient to get things working.

I was GETing and POSTing like a PRO.

Files Full of …Them

I added a file input to my form, simple enough, and turned to the matter of dealing with it in my code. My early research had turned up a promising standard library call, FormFile. However, after struggling with it for a while, I realized that although the documentation says “FormFile calls ParseMultipartForm and ParseForm if necessary”, I seemed to need to call ParseMultipartForm myself. Perhaps I missed something.

I dug into the details of the data returned by FormFile. I had immediate access, through the header, to the filename and the file headers, but size, my main goal, wasn’t present in those headers. I did start echoing the filename back as a proof of concept. Digging deeper into the documentation, I learned that the backing structure might be an *os.File (which I could treat like any ordinary file, and get the size from), but it might not. Bummer. All I could really count on was the set of interfaces embedded in multipart.File:

type File interface {
	io.Reader
	io.ReaderAt
	io.Seeker
	io.Closer
}

I knew how to use a Reader, so I tried just reading from the file: making a buffer, loading the data into it, then checking the size. That might have worked, but it didn’t feel like the right approach. I didn’t care about the actual data, only how much of it there was.

On the Shoulders of Giants

Some googling around the problem turned up a lovely guide to doing precisely what I wanted. In short, the author recommended attempting a type switch to *os.File, grabbing the size the normal way if that succeeded, and otherwise using the Seek method to find the size. The relevant code looks like this:

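var size int64 // stays 0 if the size cannot be determined
switch t := file.(type) {
case *os.File:
	// Backed by a real file on disk: ask the filesystem for its size.
	if stats, err := t.Stat(); err == nil {
		size = stats.Size()
	}
default:
	// Otherwise seek to the end; the returned offset is the size in bytes.
	if n, err := file.Seek(0, io.SeekEnd); err == nil { // io.SeekEnd == 2
		size = n
	}
}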

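The somewhat arcane file.Seek(0, 2) means: move to offset 0 relative to the end of the file (that is what a whence value of 2, named io.SeekEnd in the io package, stands for) and return the new position measured from the start. That position is the size of the file in bytes, and no data is actually read along the way. If the contents need to be read afterwards, the file has to be rewound to the start first. Perfect!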

Wrap It Up!

Only a few things remained. Though the user stories don’t require it, I added both the filename and the Content-Type to the JSON response. I had immediate access to them anyway through the file header, so why not?

Early on I had also accidentally clicked “Submit” without actually picking a file and crashed the program, so I dealt with that now. It feels like a bit of a kludge, but I simply redirect back to the main route and return from the handler, which is then immediately invoked again for the redirected request. I decided not to show the user an error.

Finally, I added a slightly more helpful message on the main html page.

Room for Improvement

  • a bit of refactoring love wouldn’t hurt
  • investigate why /files/ vs /files matters
  • what other file metadata could reasonably be extracted?
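  • ParseMultipartForm only keeps up to its maxMemory argument in memory and, per its documentation, stores the remainder in temporary files on disk, so it does not cap upload size by itself; enforcing a limit and reporting some kind of error message (via JSON?) to users who attempt to upload large files is probably a good idea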