Storing Student Stuff, Part I

For the last few months, I've been working on a project to provide some file storage space to all 12,500 of our students here at UNI. Many (most?) other schools provide storage space, but we've never done it in a centrally organized fashion, though a few of the Colleges have dabbled with it over the years. I've met with the IT staff from the colleges to build a list of requirements, and met with students to ask what they want. Then I tried to figure out what technology to use to meet those requirements. The obvious choice is the typical file server, but one of the requirements I got back was high availability, which really translates to clustering. Clustering tends to be a pain in the backside to set up and maintain, and the complexity it introduces can cause more downtime through configuration issues than it prevents during hardware failures.

So, that got me thinking about filer technology, something I'd never worked with. I knew that Joe Breu, a former sysadmin at our municipal cable utility who now works for Rackspace, had, so I asked him about it, and he suggested I look at a NetApp filer. I was quite impressed, but I've also been looking at NAS-type devices from EMC and Sun.

So how do all these choices stack up? What are our requirements? What is the best fit? What can we afford? Which of the requirements are "hard" requirements and which are just wishes? How much space does each student need? How big should we make the disk quotas? What percentage of students will use the service? What percentage of students will utilize their entire disk quota? How are we going to back all this storage up? What will backing this up cost? These are the questions I've been attempting to answer over the last few weeks, so I decided to blog it all to try and help keep it all straight.
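As a rough illustration of how the sizing questions interact, here is a minimal Python sketch of a back-of-the-envelope capacity estimate. Only the 12,500 student count comes from this post; the quota size, adoption rate, and average fill rate are made-up placeholders, not measurements or decisions, and the function name is hypothetical.

```python
def estimate_storage_gb(students, quota_gb, adoption_rate, avg_fill_rate):
    """Return (worst_case_gb, expected_gb) for a per-student quota scheme.

    worst_case_gb: every student fills their entire quota.
    expected_gb:   only a fraction of students use the service, and those
                   who do fill only part of their quota on average.
    """
    if not (0 <= adoption_rate <= 1 and 0 <= avg_fill_rate <= 1):
        raise ValueError("rates must be between 0 and 1")
    worst_case_gb = students * quota_gb
    expected_gb = students * adoption_rate * quota_gb * avg_fill_rate
    return worst_case_gb, expected_gb


STUDENTS = 12_500        # from the post
QUOTA_GB = 1.0           # hypothetical quota per student
ADOPTION_RATE = 0.5      # hypothetical: half of the students use the service
AVG_FILL_RATE = 0.25     # hypothetical: users fill a quarter of their quota

worst, expected = estimate_storage_gb(
    STUDENTS, QUOTA_GB, ADOPTION_RATE, AVG_FILL_RATE
)
print(f"Worst case: {worst:,.0f} GB")
print(f"Expected:   {expected:,.1f} GB")
```

The gap between the two numbers is the reason the adoption and fill-rate questions matter: the worst case shows what full quotas would demand, while the expected figure is closer to what would actually need to be stored and backed up. Swapping in real survey or pilot data would change both.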

The first thing we tried to establish was which protocols students could use to access their data.

  • CIFS - This is a no-brainer; all the Windows boxes will use it.
  • NFS - Probably not a hard requirement, but definitely a nicety for the Linux/UNIX and Macintosh users.
  • WebDAV - This would let students mount their files as a drive while off-campus or on their personal machines.
  • HTTP/Web Access - Some sort of web page where students can easily upload and download a file, for quick remote access.
  • SFTP - So tech-savvy students can easily upload/download large numbers of documents remotely, without broadcasting their passwords in plain text as FTP would.

So, obviously, these things can all be accomplished on a traditional server, but making them work together seamlessly would be difficult. Having NFS and CIFS permissions coexist can be tricky, and I have yet to find a good web access application for sharing files from a filesystem on the web. There are plenty that store files in a database, but I need to serve them from a filesystem...

    Continue to Part II.