UF Guidelines to Develop Applications for Secure Deployment
Applications often serve as the delivery mechanism through which personal data and other sensitive information are transferred online. Unsecured or poorly written applications can be exploited to bypass security measures or used to transfer information that is easily intercepted. The following guidelines outline several steps necessary for application developers to prevent such abuses.
Federal law, state law and UF policies require protection of personal, confidential and sensitive data. Some applicable laws and policies are listed below. Other UF IT policies can be found on the Policies and Standards web site.
- UF is required by the Buckley Amendment and state statute to maintain the confidentiality of student records. For more information, refer to Student Records.
- Units that deal with patient information will need to be aware of their responsibilities under the Health Insurance Portability and Accountability Act of 1996 (HIPAA).
- Financial data is protected by the Gramm-Leach-Bliley Act (GLBA). For more requirements on applications that process financial data, refer to the UF E-Commerce Policy.
Defense in Depth
Security should be implemented at multiple levels to prevent a breach in one level from compromising the entire application. Users should be aware of all the techniques discussed in this document and use multiple techniques where possible and appropriate. See the UF Network and Host Security Standard and Procedures at www.it.ufl.edu/policies/security/uf-it-sec-network.html. Consider access control lists and firewalls for added protection.
Methodology, Review and Testing
It's better to avoid security vulnerabilities than to fix them. Conduct internal peer review or an external, third-party assessment. Choosing a formal development methodology will impart structure, reduce errors, and encourage review at each stage of development. Developers must demonstrate compliance with the UF Network and Host Security Standard and Procedures.
Automated tools can be used for review and testing, but should not replace manual methods. Testing should include the following methods:
- Attempt to impersonate users or servers
- Attempt to perform fraudulent transactions
- Attempt to compromise data
- Attempt to send junk data
- Attempt to compromise server
- Attempt a denial of service
General Application Security
- Authentication, authorization and data permissions. User and application data should be viewable and/or modifiable by only authorized applications and users. Ensure proper protections (file permissions, DB permissions, etc) are used so that unauthorized users or applications are unable to make use of the data in question.
- Encryption. Encryption at the communications layer prevents eavesdroppers on the network from passively watching the application data as it passes from client to server. Very sensitive data could be stored in an encrypted form as well, further protecting against other breaches in the application.
- SSL <see Web Security>
- VPN - While VPNs are becoming more popular, they aren't as widespread as SSL, and in general SSL should be the user's first choice for network encryption.
- SSH - SSH is a secure replacement for telnet, rlogin, rcp and rsh, which uses strong encryption to prevent login passwords and session data from being compromised. Any application requiring remote terminal logins should use ssh instead of the older, insecure protocols. Client and server versions are available for most operating systems, including Windows.
- Public Key (GPG/PGP/X509) - Use of public key encryption allows for user data to be gathered and encrypted with the collection application without the need for the private (decrypting) key. Another application can then process the data using the private key. This prevents a compromise of the collection application from allowing the decryption of collected data.
- Symmetric Key - Symmetric key encryption can also provide added protection, but the key must be present in all applications requiring encryption or decryption, so protection of the key is paramount.
- Temporary files. Temporary files are subject to attacks if not handled properly. In general, an attacker compromises temp files by creating a world-writable file or symbolic link (unix) with the name of a file the application will open, then either capturing data written there or using operations performed on the symlink to compromise other files on the system. To ensure the application uses temp files securely, use the following guidelines:
- Use temporary files sparingly, if at all.
- Avoid publicly writable temporary directories if possible. If using a publicly writable directory, make a directory within it for temporary files, with read and write permissions for the application only.
- When generating temporary file names, make the names as hard to guess as possible, e.g., use a filename generated by running random data through a digest operation (MD5/SHA1).
- Before creating the file, attempt to remove it and check the return value of the remove operation. If the return value indicates invalid permissions on a file already existing, choose another filename. On UNIX-based systems, use the flags O_CREAT|O_EXCL to create the file. The open operation will fail if the file exists.
- On systems that support it, use the mkstemp() library function, which returns a handle to an already-open, securely created file.
- Ensure temp files are created with permissions that allow only the application to use or delete them.
- Always delete temporary files when finished with them.
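The file-creation guidelines above can be sketched in C roughly as follows. This is a minimal illustration, assuming a POSIX system; the function names open_private_temp and create_exclusive and the "app-" filename prefix are hypothetical and not part of any UF standard.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a private temporary file inside `dir` (assumed to be a directory
 * owned by the application). On success, returns an open descriptor and
 * leaves the generated name in `path_out`; returns -1 on failure.
 */
int open_private_temp(const char *dir, char *path_out, size_t path_len)
{
    /* mkstemp() replaces the trailing XXXXXX with a unique suffix. */
    int n = snprintf(path_out, path_len, "%s/app-XXXXXX", dir);
    if (n < 0 || (size_t)n >= path_len)
        return -1;                      /* template would not fit */

    mode_t old_mask = umask(077);       /* owner-only permissions */
    int fd = mkstemp(path_out);         /* creates and opens in one step */
    umask(old_mask);
    return fd;
}

/*
 * Fallback for a caller-chosen name: with O_CREAT | O_EXCL, open() fails
 * if anything, including a symbolic link, already exists at `path`.
 */
int create_exclusive(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

A caller would write through the returned descriptor, close() it, and unlink() the path as soon as the file is no longer needed.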
Auditability and Logs
The code itself should be easily auditable. This includes following proper programming guidelines and documenting code where the intent is not immediately obvious. Most importantly, however, usage, errors, and abnormal conditions should be tracked with logs that are monitored in some manner. Watching application logs is one of the best ways to detect a number of different cracking attempts, such as password brute forcing, data injection and other forms of data input validation abuse. Proper logging will record failures and the error conditions.
Consider archival of the logs for a reasonable length of time to allow comparison of current logs with previous logs.
If the volume of logs generated by an application is prohibitively large, try automating parsing of the logs so that the application developer or system administrator does not have to review the entire log by hand. Additionally, coding different verbosity levels within the application for different levels of log monitoring is extremely useful.
Data Input and Validation
Before working with user input, ensure that it is safe by limiting the allowed characters. If direct input were trusted, users could abuse the application for many malicious purposes such as compromising the host or retrieving protected information. For more information, see the following CERT advisories and referenced articles:
- Advisory CA-1997-25: Sanitizing User-Supplied Data in CGI Scripts
- How To Remove Meta-characters From User-Supplied Data In CGI Scripts (CERT Coordination Center article)
- Advisory CA-2000-02: Malicious HTML Tags Embedded in Client Web Requests
- Scriptlet Security (Microsoft Developers Network article)
Never Trust Client Data
- Validate client IP addresses. Never trust the client machine to tell you its IP address or DNS name. If you're restricting a function to certain IP numbers, check the actual remote IP address of the connection. If you're doing restrictions by DNS name, translate that numeric IP address to a name, then translate that name back to a number. Only accept the name if the number you get back matches the number you started with.
- Use server-side validation of all commands.
- Environment variables such as HTTP_REFERER and REMOTE_USER should never be used to implement restrictions, as they are easily spoofed.
- Consider replay attacks as well as spoofing. The attacker does not have to guess what a valid session ID would look like if s/he can just use somebody else's.
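The DNS name check described in the first item above (address to name, name back to address, compare) might look like the following in C. This is a sketch under stated assumptions: a POSIX resolver is available, the caller already holds the peer address of the connection (for example from accept() or getpeername()), and the function name confirm_peer_name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Return 1 if the peer address resolves to a name that in turn resolves
 * back to the same address; return 0 otherwise. On success the confirmed
 * name is left in `name`.
 */
int confirm_peer_name(const struct sockaddr *peer, socklen_t peer_len,
                      char *name, size_t name_len)
{
    /* Step 1: address -> name. NI_NAMEREQD fails when no name exists. */
    if (getnameinfo(peer, peer_len, name, (socklen_t)name_len,
                    NULL, 0, NI_NAMEREQD) != 0)
        return 0;

    /* Step 2: name -> list of addresses of the same family. */
    struct addrinfo hints, *res, *p;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = peer->sa_family;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return 0;

    /* Step 3: accept the name only if one address matches the original. */
    int ok = 0;
    for (p = res; p != NULL && !ok; p = p->ai_next) {
        if (p->ai_family != peer->sa_family)
            continue;
        if (peer->sa_family == AF_INET) {
            const struct sockaddr_in *a = (const struct sockaddr_in *)peer;
            const struct sockaddr_in *b =
                (const struct sockaddr_in *)p->ai_addr;
            ok = memcmp(&a->sin_addr, &b->sin_addr, sizeof a->sin_addr) == 0;
        } else if (peer->sa_family == AF_INET6) {
            const struct sockaddr_in6 *a = (const struct sockaddr_in6 *)peer;
            const struct sockaddr_in6 *b =
                (const struct sockaddr_in6 *)p->ai_addr;
            ok = memcmp(&a->sin6_addr, &b->sin6_addr,
                        sizeof a->sin6_addr) == 0;
        }
    }
    freeaddrinfo(res);
    return ok;
}
```

Only when the function returns 1 should the name be compared against any list of permitted hosts; a return of 0 means the name cannot be trusted.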
HTML tags can be abused to change the display of an application or even execute code on the client machine. If HTML tags are used by users to format text input, always be sure to explicitly define which tags are allowed. For example, tags other than <P>, <B>, <I>, and <FONT> could be stripped.
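One way to enforce an explicit list of allowed tags is sketched below in C. It is deliberately conservative and rests on assumptions not stated in this document: only bare <P>, <B> and <I> tags (and their closing forms) are kept, any tag carrying attributes is dropped, and an unterminated "<" is escaped. The function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Tags users may keep; everything else is dropped. */
static const char *const allowed_tags[] = { "p", "b", "i" };

/* Return 1 if the text between '<' and '>' is a bare allowed tag such as
 * "b" or "/B", with no attributes. */
static int tag_is_allowed(const char *tag, size_t len)
{
    if (len > 0 && tag[0] == '/') {
        tag++;
        len--;
    }
    for (size_t k = 0; k < sizeof allowed_tags / sizeof allowed_tags[0]; k++) {
        const char *name = allowed_tags[k];
        if (strlen(name) != len)
            continue;
        size_t j = 0;
        while (j < len && tolower((unsigned char)tag[j]) == name[j])
            j++;
        if (j == len)
            return 1;
    }
    return 0;
}

/* Copy `in` to `out`, keeping allowed tags and dropping all others.
 * Returns 0 on success, -1 if `out` is too small. */
int strip_disallowed_tags(const char *in, char *out, size_t out_len)
{
    size_t o = 0;
    if (out_len == 0)
        return -1;
    while (*in != '\0') {
        size_t chunk = 1;           /* input bytes consumed this round */
        const char *emit = in;      /* bytes to copy to the output */
        size_t emit_len = 1;

        if (*in == '<') {
            const char *end = strchr(in, '>');
            if (end == NULL) {      /* unterminated tag: escape the '<' */
                emit = "&lt;";
                emit_len = 4;
            } else {
                chunk = (size_t)(end - in) + 1;
                emit_len = tag_is_allowed(in + 1, chunk - 2) ? chunk : 0;
            }
        }
        if (emit_len >= out_len - o)    /* keep room for the final '\0' */
            return -1;
        memcpy(out + o, emit, emit_len);
        o += emit_len;
        in += chunk;
    }
    out[o] = '\0';
    return 0;
}
```

Dropping every tag that has attributes is stricter than the <FONT> example above would require, but it avoids having to decide which attributes are safe.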
Buffer Overflows and Memory Management
Any application written in a programming language that requires the developer to allocate and deallocate memory (C, C++) must be written with proper memory management in mind. All memory allocated with malloc(), et al., should be freed when no longer needed, and care should be taken not to free an area of memory more than once. Integer variables that index memory should stay within appropriate bounds and be checked for overflow and underflow. See the Microsoft document, Reviewing Code for Integer Manipulation Vulnerabilities, for more information.
Buffer sizes should be checked so that no attempt is made to copy more data than the buffer can hold. For string management, the strl* routines are generally preferred over the strn* routines. strcpy(), strcat(), et al. should be avoided completely, and developers should use snprintf() over sprintf(). See the paper on strlcpy and strlcat by Todd Miller and Theo de Raadt.
Memory management is difficult to get right in languages such as C and C++. Languages such as Perl and Java use garbage collection and are better at string handling than C/C++. Care must still be taken when passing data to operating system calls, as most operating systems are written in C/C++.
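As a small illustration of the buffer-size and integer-overflow advice, the C sketch below formats a string with snprintf() and rejects truncated results, and refuses an allocation whose size computation would overflow. The function names format_address and alloc_array are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Format "user@host" into dst. Returns 0 on success, or -1 if the result
 * would not fit (dst is still NUL-terminated when dst_len > 0). */
int format_address(char *dst, size_t dst_len,
                   const char *user, const char *host)
{
    int n = snprintf(dst, dst_len, "%s@%s", user, host);
    return (n >= 0 && (size_t)n < dst_len) ? 0 : -1;
}

/* Allocate count * size bytes, refusing requests whose product would
 * overflow size_t and silently produce a too-small buffer. */
void *alloc_array(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;
    return malloc(count * size);
}
```

Checking the return value of snprintf() matters: a truncated string is safe from overflow but may still be wrong for the application, so it is treated as an error here.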
Programmers should never allow the user to specify his own format string:
syslog(LOG_ERR, buf); /* BAD, allows user to specify format string */
This allows the user to send in arguments like the following:
char *garbage = "%s%s%s%s%s%s%s%s <a lot of garbage>";
write(sockfd, garbage, strlen(garbage)); /* strlen(), not sizeof(): sizeof(garbage) is the size of the pointer */
Such input can cause the reading application to move up and down the stack and possibly overwrite areas of memory, similar to the way buffer overflows work.
Be sure all such routines that take a format string have a developer-supplied format string:
read(sockfd, buf, sizeof(buf));
syslog(LOG_ERR, "%s", buf); /* GOOD, developer-supplied format string */
Form implementation: POST vs. GET
When using forms, prefer POST to GET. The GET method transfers all information to the application in the URL, where it is much more visible to casual observers and is logged in many places, such as web server logs and browser histories. POST data, on the other hand, is sent in the request body: it does not appear in the browser's address bar, is not normally logged, and does not have some of the size and content restrictions of GET.
Data Injection and Cross-site Scripting
Two examples of application abuse when proper data validation does not occur are data injection and cross-site scripting. Data injection allows the user to specify extra information along with legitimate input. Cross-site scripting is a relatively new form of attack in which data embedded by one user is presented to other users and executed in some manner, causing their clients to behave in an undesirable way. For example, a user may post an image in a comment to a forum, which other users' browsers will automatically visit, potentially giving away cookie or other session information. For more information on cross-site scripting, see the CERT documents Malicious HTML Tags Embedded in Client Web Requests and Understanding Malicious Content Mitigation for Web Developers.
Important to note: The sample CGI applications that come installed on many Web servers often represent a serious security risk. For more information, see the UF Network and Host Security Standard and Procedures.
Securing Web Services With Secure Socket Layer (SSL)
SSL is a commonly used Web protocol that uses strong encryption to communicate securely. It also allows the browser to authenticate the server and detect whether the data has been altered or tampered with in transit. Fully compatible with the Web's most popular browsers, SSL is used by many major e-commerce Web sites. When users connect to a secure Web server using SSL, such as https://www.someunit.ufl.edu/ (note that the "s" indicates a secure server), that server authenticates itself to the Web browser by presenting a digital certificate. Digital certificates are issued to Web sites by a Certifying Authority (CA), once the CA is satisfied 1) as to the identity of the requester and 2) that the requester actually owns, or has the right to use, the domain name for which the certificate will be issued. A valid certificate gives customers confidence that they are sending personal information securely and that it is going to the right place.
Session Management
To build secure applications, particularly web applications, it is imperative that a series of requests from a particular client be associated with each other and that the server be aware when the client is no longer active. This set of precautions is known as session management. Without session management, an unauthorized user could gain access to personal data from UF systems by assuming an idle session.
Session management is especially important for web applications, since they do not establish a persistent connection between client and server, but send each page as a separate network connection. However, even applications which use persistent network connections can benefit from the guidelines below.
Many different techniques may be employed. Your choices will be determined by your platform, server software, application software, programming language, and your desired level of security. Listed below are some ways that UF application developers can use session management techniques to protect data served via UF applications:
- Authorization - While not strictly part of session management, authorization is typically done at the beginning of a session. Authorization is the process of determining who may do what.
- Authentication - You must uniquely identify the user at the creation of a session. This means you must log them onto your system or have some way of determining they are already logged onto another trusted application. The recommended practice is to use existing UF authentication methods to identify the user. GatorLink authentication service is a central service provided by the Office of Information Technology and GatorLink IDs are required for all university users. See http://www.it.ufl.edu/identity/shibboleth/ for information about using GatorLink authentication. <RACF> may be a good choice if you need more granularity.
- Preventing Session Hijacking - If an intruder can eavesdrop on the beginning of a session, they may be able to "hijack" that session by using the session's keys or sequence numbers to impersonate the rightful client. To prevent this, sessions involving sensitive data should always be encrypted; and session keys, shared secrets, and TCP sequence numbers must never be easily guessable.
- Tracking the user - Since these techniques are subject to spoofing and replay attacks, use two or more of the following.
- IP Address - The IP address technique associates a session with a client IP address. This technique works well when all clients have fixed IP addresses. It does not work reliably if the client does not have a fixed IP address, for example, if the client uses Network Address Translation (NAT). This technique also does not work for clients that use a proxy server, because all users of the proxy report the same IP address. IP addresses can be spoofed. For these reasons, this technique is probably useful only for intranet applications.
- Cookies - The web service sends an HTTP cookie to the client. The client will then return the cookie on each subsequent request to the server. The server should confirm the cookie is valid, even if it did not originate from the current server. To be secure, the cookie exchanges must be protected by an SSL-encrypted session, and cookies must be expired promptly by the server after logout. Even then, session cookies can be compromised by cross-site scripting attacks or spyware on the client workstation. Since cookies are easily forged, the server should be able to distinguish cookies it gave out from bogus ones. One way to do this is to place a timestamp, IP address and username in the cookie, make an MD5 or SHA-1 hash of all that information, then sign it with a private key. Then any presented cookie can be validated by verifying the signature. Another way to do this is to keep a record of all recently issued cookies, along with the time issued, the IP address issued to, browser version information and the user. Only accept a cookie if it matches one you know you sent to that location. In general, cookies have limitations on the size of the cookie and the number of cookies an application can have. Cookies also raise privacy concerns which cause many users to disable cookies in their browser preferences. Your site should detect whether cookie support is absent or turned off. You can do this by sending a test cookie and checking for it, as early as possible in the session. If your application detects that cookies are not being returned, an alternate means can be employed to maintain session data, or the user can be told how to enable cookies.
- URL encoding - URL encoding involves rewriting URLs on the fly to include session information. This is very useful when a client refuses to accept cookies. The unique session key is appended to each URL on the page as a name/value pair. When the user clicks a link or submits a form via GET, the key is sent along with the HTTP GET request. URL encoding is easily spoofed, so take appropriate steps to encrypt or validate URL content. As with cookies, you must protect the session key with SSL and expire it promptly. Unlike cookies, the URLs will also show up in web server logs and browser histories, making them even easier to compromise.
- Maintaining all state in browser - It's possible to save the entire state of a web session in hidden form variables, URLs, or cookies, so that the server doesn't have to remember anything between requests (Google searches work this way). This is not recommended unless you encrypt and sign the data stored on the client. Use strong encryption, and use existing encryption available for most major programming languages. The best way to implement encryption is with public key cryptography, which allows you to not only decrypt the data, but verify that your key was used to sign the data in the first place. Symmetric cryptography (same key encrypts/decrypts) may be used; however, be aware that once that secret is known, the security of the application may be compromised. Additionally, encode the data in such a way that a stolen encrypted token is not usable by another user, whether through IP restrictions, timeouts, or other methods described in this document. If you absolutely must use this technique because of server limitations, you should re-validate all the session information at each request.
- Maintaining state in server side database - With this technique the state of a web session is stored at the server; the browser only keeps enough information to uniquely identify the session. This is the recommended method. The server could store the information using a database, disk files, persistent objects, or some other method.
- Logout - To protect access to UF data records, the best practice is for users to be asked to log out when they are finished.
- Timeouts - Idle sessions must be invalidated by enforcing timeout periods. The user could be logged out completely, or could be asked to re-authenticate in order to re-activate the session. Keep the timeout to a minimum, 5 minutes or even less for very sensitive data services.
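Two of the points above, hard-to-guess session keys and idle timeouts, can be sketched in C as follows. This is an illustration only, assuming a system that provides /dev/urandom; the struct layout, the function names, and the 16-byte identifier length are assumptions, while the 5-minute limit follows the guideline above.

```c
#include <stdio.h>
#include <time.h>

#define SESSION_ID_BYTES 16
#define IDLE_LIMIT_SECONDS (5 * 60)

struct session {
    char   id[2 * SESSION_ID_BYTES + 1]; /* hex-encoded random identifier */
    time_t last_seen;                    /* time of the most recent request */
};

/* Fill s->id with 128 random bits read from /dev/urandom.
 * Returns 0 on success, -1 on failure. */
int session_start(struct session *s, time_t now)
{
    unsigned char raw[SESSION_ID_BYTES];
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(raw, 1, sizeof raw, f);
    fclose(f);
    if (got != sizeof raw)
        return -1;
    for (size_t i = 0; i < sizeof raw; i++)
        snprintf(s->id + 2 * i, 3, "%02x", raw[i]);
    s->last_seen = now;
    return 0;
}

/* Call on every request. Returns 1 and refreshes the idle timer if the
 * session is still active; returns 0 if it has been idle too long. */
int session_touch(struct session *s, time_t now)
{
    if (difftime(now, s->last_seen) > IDLE_LIMIT_SECONDS)
        return 0;
    s->last_seen = now;
    return 1;
}
```

When session_touch() returns 0, the server would discard the stored session state and require the user to authenticate again.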
How to Set Up a Secure Web Site at UF
Deploying a secure Web service at UF involves a few administrative steps. First, identify which entity is hosting your Web site. If your site is hosted by Computing and Networking Services (formerly NERDC) on their NERSP machine, then they can handle the whole process for you. Contact the Open Systems Group (firstname.lastname@example.org or (352) 392-2061) to request secure Web services on the NERSP.
If you are hosting your Web site locally on your own machine, then you will have to follow these steps:
- Have your system administrator generate a public/private key pair for the Web server you would like to secure.
- Submit your digital certificate to be signed by a well-recognized certifying authority (CA), e.g., Thawte or VeriSign.
- Install the digital certificate on your Web server and create a domain name entry for the secure (https) service.
- Make sure the secured resources can't be accessed by the insecure (http) methods.
- Keep your digital certificate up to date; certificates are time limited.
Questions about setting up secure Web services should be directed to the NERDC Open Systems Group at email@example.com or (352) 392-2061.
Databases and Other Data Storage
If external files are used in your application, further checks must be done to ensure that malicious users are not exploiting the application to retrieve sensitive information. Use the following guidelines when reading from or writing to external files:
- Authenticate connections from the web server.
- Consider encrypted database connections if the database server is not on the same machine as the web server or on the same local network segment.
- Do not put passwords in scripts. If the script needs a password, for instance, to connect to a database, it should read that password from a protected configuration file.
- Allow backend database access only from the web server if possible.
- Files that are properly protected from unauthorized web access are not necessarily properly protected from other methods of access.
- Use a data directory. Data files should not be placed in the cgi-bin directory or in any directory that can be accessed from a Web browser.
- Do not use raw path and filenames. Never accept user input as the path information for a file. Instead, use some type of identifier for the form which points to an actual file.
- Use absolute paths. Do not assume that an application is being called from a certain directory. Use a fully qualified path to the files your application accesses.
- Specify the mode when opening a file. Configuration files should be specified as read-only, data files should be specified as write-only, and log files should be specified as append-only.
- Avoid temp files
- Maintain current patch levels
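The advice above on identifiers, absolute paths, and explicit open modes can be combined in a small lookup table, sketched here in C. The identifiers, paths, and function names are hypothetical and purely illustrative; the point is that user input selects a table entry and never becomes part of a path.

```c
#include <stdio.h>
#include <string.h>

struct file_entry {
    const char *id;     /* identifier a form is allowed to send */
    const char *path;   /* absolute path outside the web-accessible tree */
    const char *mode;   /* fopen() mode appropriate to the file's role */
};

/* Illustrative table: configuration is read-only, the log is append-only. */
static const struct file_entry file_table[] = {
    { "config", "/var/app/data/app.conf", "r" },
    { "log",    "/var/app/data/app.log",  "a" },
};

/* Return the table entry for `id`, or NULL if the id is not recognized. */
const struct file_entry *lookup_file(const char *id)
{
    if (id == NULL)
        return NULL;
    for (size_t k = 0; k < sizeof file_table / sizeof file_table[0]; k++) {
        if (strcmp(id, file_table[k].id) == 0)
            return &file_table[k];
    }
    return NULL;
}

/* Open the file named by a user-supplied identifier. Anything that is not
 * an exact match for a known id, including a raw path, is rejected. */
FILE *open_by_id(const char *id)
{
    const struct file_entry *e = lookup_file(id);
    return e != NULL ? fopen(e->path, e->mode) : NULL;
}
```

A request carrying "../../etc/passwd" or any other raw path simply fails the lookup, so no path sanitizing logic is needed.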
Using UF Data Resources
User supplied data is frequently misspelled, incomplete or missing. Applications that rely on personal data supplied by the user are vulnerable to data corruption. Applications should therefore limit the amount of personal data input from users. Instead, it is highly recommended that campus developers require user authentication from official UF systems. Personal data about authenticated users can then be acquired from official university directory resources as described below.
Overview of UF directory service methods
The DB2 database is the authoritative source for University of Florida directory information. It is maintained by Information Systems and resides on a mainframe computer at the Northeast Regional Data Center (NERDC). As the official UF directory, this database contains student and employee demographic data. Access to the official directory is restricted to authorized applications.
The LDAP directory is essentially a copy of the official database and is maintained by NERDC. However, the data model is slightly different and so are the access methods and security restrictions. Most data is public; in fact, the "phonebook" function accessible through the main UF Web site is powered by this LDAP data. However, LDAP does contain some secure data, which are available only to authorized users. Also, some users will not be accessible to unregistered applications from LDAP if their privacy flag is enabled.
These directories can be accessed from a variety of programming languages and methods such as Perl, Java, etc. Secured access can be obtained by justifying your needs to the appropriate controlling entities. (contact info needed)
Use Enterprise Data
Obtain preferred email addresses, mailing addresses, UFIDs, or other information about UF users from the UF LDAP or DB2 directories. Since all members of the UF community are required to have a GatorLink account, log the user in with GatorLink via OIT's Shibboleth authentication server. The GatorLink ID can be extracted from the authentication server response when you validate the user's browser cookie used to keep track of logins, then used to query the directory.