Custom UF Spider: SSNs & Identifiable Medical Information
Spider is an open-source security tool created by Cornell University for discovering sensitive data, such as Social Security numbers (SSNs) and credit card numbers (CCNs), on a computer system. The UF HSC Information Security office has customized this version to minimize the number of steps a user must take to begin scanning and to target Identifiable Medical Information (including SSNs and Patient IDs).
Although the custom UF HSC version of Spider has been tuned to reduce false positives, they WILL still occur.
From the Cornell Spider official documentation:
“Spider will misidentify certain types of files as containing confidential data. Every effort should be made to verify Spider’s results before moving, encrypting, or removing files.”
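To see why false positives are unavoidable, consider a minimal Python sketch of pattern-based detection. The regular expression and the `looks_like_ssn` and `find_candidates` helpers are illustrative assumptions, not Spider's actual rules. The structural checks follow the publicly documented SSN format (area number never 000, 666, or 900-999; group never 00; serial never 0000).

```python
import re

# Hypothetical pattern for illustration only; Spider's real rules may differ.
# Matches 9 digits written as AAA-GG-SSSS, AAA GG SSSS, or AAAGGSSSS.
SSN_CANDIDATE = re.compile(r"(?<!\d)(\d{3})([- ]?)(\d{2})\2(\d{4})(?!\d)")


def looks_like_ssn(area: str, group: str, serial: str) -> bool:
    """Apply structural checks that rule out numbers never issued as SSNs."""
    if area in ("000", "666") or area.startswith("9"):
        return False
    if group == "00" or serial == "0000":
        return False
    return True


def find_candidates(text: str) -> list[str]:
    """Return every 9-digit string that passes the structural checks."""
    hits = []
    for match in SSN_CANDIDATE.finditer(text):
        area, _separator, group, serial = match.groups()
        if looks_like_ssn(area, group, serial):
            hits.append(match.group(0))
    return hits


# Invented sample text: one SSN-shaped value, one invoice number, one part number.
sample = "Patient SSN 123-45-6789; invoice 987-65-4321; part no. 219099999"
print(find_candidates(sample))
```

In this sample the invoice number is rejected because its first digit is 9, but the part number passes every structural check even though it is not an SSN. A pattern alone cannot tell the two apart, which is why each logged file needs manual review.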
Requirements:
- Microsoft Windows 2000, XP, Server 2003, or Vista
- Microsoft .NET Framework 1.1 or greater
- Administrative Privileges
Installation:
- Download Spider
- Run UFHSC_Spider_MedInfo_Setup.msi
- Carefully follow the on-screen instructions.
Usage:
- From the Start menu, choose Programs > Spider > “Spider”. Important: You MUST have administrator rights to run Spider. In Vista, right-click the shortcut and choose “Run as administrator”.
- Click the “Run Spider” button. (See “Tips:” below for runtime info)
- When finished, Spider will display the log. By default, the log is stored in C:\SPIDER.log. It lists the path of each flagged file and the names of the rules that file matched, to help you determine why the file was added to the log.
- Use the log to confirm which files actually contain an SSN. After use, delete the log file or move it to a secure location.
Tips:
- Spider can also be run from the command line for unattended execution (such as on multiple hosts). See: Command-line options.
- Execution time varies greatly and depends mostly upon the type of hardware and amount of data being scanned. Expect anywhere between 15 minutes and 2 hours, but in some cases Spider can take 3 hours or more. Overnight scanning may be more practical on older systems and/or those with large amounts of data.
- By default, Spider uses most of the system's resources, which leaves most machines nearly unusable while a scan is running. This can be changed in the options dialog, but be aware that execution time will increase.
- Also by default, Spider scans temporary internet files in order to find poorly programmed web applications that cache SSNs. Clearing the cache before scanning shortens execution time, but such web applications will then go undetected.
- Before scanning, Spider determines how many files it will attempt to scan and displays that number at the top of the program window.
- By default, Spider will scan the local C:\ drive. To change this (for example, to scan a mapped network drive) or any other option (such as the regular expressions), consult the official Spider Documentation.
- The Spider log file is comma-delimited by default. As such, it can be imported into most spreadsheet applications (such as Excel) as a CSV file.
- Spider can be removed through “Add/Remove Programs” (XP) or “Programs and Features” (Vista) in the Control Panel.
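Because the log is comma-delimited, it can also be triaged with a short script instead of a spreadsheet. The Python sketch below groups rule names by file path so that each flagged file appears once. The column positions (`PATH_COL`, `RULE_COL`), the `summarize_log` helper, and the sample rows with their rule names are assumptions made for illustration. The actual column layout of the Spider log is not documented here, so check it against a real log first.

```python
import csv
from collections import defaultdict
from typing import Iterable

# Assumed column positions; verify them against a real log before relying on this.
PATH_COL = 0
RULE_COL = 1


def summarize_log(lines: Iterable[str]) -> dict[str, list[str]]:
    """Group the rule names in a comma-delimited log by file path."""
    rules_by_file = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) <= max(PATH_COL, RULE_COL):
            continue  # skip blank or short lines
        rules_by_file[row[PATH_COL].strip()].add(row[RULE_COL].strip())
    return {path: sorted(rules) for path, rules in rules_by_file.items()}


# Invented rows standing in for real log content.
sample = [
    r"C:\Users\jdoe\notes.txt,SSN",
    r"C:\Users\jdoe\notes.txt,PatientID",
    r"C:\Temp\cache.htm,SSN",
]

for path, rules in summarize_log(sample).items():
    print(f"{path}: {', '.join(rules)}")
```

To run this against a real log, open the file with `open(r"C:\SPIDER.log", newline="")` and pass the file object to `summarize_log`. The summary only narrows down which files to inspect; each one still has to be checked by hand.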