Introduction to MIME
MIME (Multipurpose Internet Mail Extensions) is a standard proposed by Bell Communications in 1991 to extend the limited capabilities of email, in particular by allowing documents (such as images, sound, and text) to be included in a message. It was originally defined by RFCs 1341 and 1342 in June 1992.
Using headers, MIME describes the type of message content and the encoding used.
MIME adds the following features to email service:
- Multiple attachments in a single message;
- Unlimited message length;
- Use of character sets other than ASCII;
- Use of rich text (layouts, fonts, colors, etc.);
- Binary attachments (executables, images, audio or video files, etc.), which may be split into several parts if needed.
MIME uses special header directives to describe the format used in a message body, so that the email client can interpret it correctly:
- MIME-Version: This is the version of the MIME standard used in the message. Currently only version 1.0 exists.
- Content-Type: Describes the data's type and subtype. It can include a "charset" parameter, separated by a semicolon, defining which character set to use.
- Content-Transfer-Encoding: Defines the encoding used in the message body.
- Content-ID: Gives each message part a unique identifier.
- Content-Description: Gives additional information about the message content.
- Content-Disposition: Defines how the attachment is presented, in particular the name associated with the file, using the filename parameter.
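As a rough illustration of these headers, the sketch below builds a message with one attachment using Python's standard email package and prints the headers the library generates. The addresses, subject, file name, and placeholder bytes are all made up for the example.

```python
from email.message import EmailMessage

# Hypothetical message: addresses, subject, and file name are illustrative only.
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Quarterly chart"
msg.set_content("The chart is attached.")

# A few placeholder bytes stand in for real image data.
fake_png = b"\x89PNG\r\n\x1a\n" + bytes(16)
msg.add_attachment(fake_png, maintype="image", subtype="png",
                   filename="chart.png")

# The top-level message carries MIME-Version; each part carries Content-* headers.
attachment = next(msg.iter_attachments())
print("MIME-Version:", msg["MIME-Version"])
print("Top-level type:", msg.get_content_type())
for name in ("Content-Type", "Content-Transfer-Encoding", "Content-Disposition"):
    print(f"{name}: {attachment[name]}")
```

The library picks base64 for the binary part on its own; a hand-built message would have to set Content-Transfer-Encoding explicitly.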
Primary MIME types
MIME types, used in the Content-Type header, classify the documents attached to an email. A MIME type is composed as follows:
type/subtype
A GIF image, for example, has the following MIME type:
image/gif
The primary data types, sometimes called "discrete data types," are:
- text: readable text data: text/rfc822 [RFC822]; text/plain [RFC2646]; text/html [RFC2854].
- image: binary data representing digital images: image/jpeg; image/gif; image/png.
- audio: digital sound data: audio/basic; audio/wav
- video: video data: video/mpeg
- application: Other binary data: application/octet-stream; application/pdf
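To show how a type, subtype, and charset parameter are read from a Content-Type header, here is a minimal sketch using Python's standard email parser. The header value and body are invented for the example.

```python
from email import message_from_string

# Hypothetical message part: one header, a blank line, then the body.
raw = 'Content-Type: text/html; charset="ISO-8859-1"\n\n<p>Hello</p>\n'
part = message_from_string(raw)

maintype = part.get_content_maintype()   # the part before the slash
subtype = part.get_content_subtype()     # the part after the slash
charset = part.get_content_charset()     # the "charset" parameter, lowercased

print(maintype, subtype, charset)
```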
MIME types are also used on the Web to classify documents transferred over HTTP. When a web server answers a browser's request, it sends the file's MIME type in the Content-Type response header, ahead of the file itself, so that the browser knows how to display the document.
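A server often derives the MIME type from the file name. The sketch below shows one way to do that with Python's standard mimetypes module. The helper name content_type_header and the file names are hypothetical, and falling back to application/octet-stream for unknown extensions is a choice made here, not something the text above prescribes.

```python
import mimetypes

# A dedicated MimeTypes instance starts from Python's built-in defaults.
table = mimetypes.MimeTypes()

def content_type_header(filename):
    """Return a Content-Type header line guessed from the file extension."""
    mime_type, _encoding = table.guess_type(filename)
    # Assumption: unknown extensions are served as generic binary data.
    return "Content-Type: " + (mime_type or "application/octet-stream")

for name in ("logo.gif", "index.html", "report.pdf", "data.unknownext"):
    print(name, "->", content_type_header(name))
```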
To transfer binary data, MIME offers five encoding formats, which can be declared in the Content-Transfer-Encoding header:
- 7bit: 7-bit text format (for messages without accented characters);
- 8bit: 8-bit text format;
- quoted-printable: Quoted-Printable format, recommended for messages that are mostly 7-bit text with a few 8-bit characters (such as accented letters);
- base64: Base 64, recommended for sending binary files as attachments;
- binary: binary format; not recommended.
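The two encodings that turn 8-bit data into 7-bit-safe text can be tried directly with Python's standard base64 and quopri modules. This is a small sketch; the sample bytes and the word "façade" are arbitrary inputs chosen for illustration.

```python
import base64
import quopri

# base64: every 3 input bytes become 4 ASCII characters.
binary_data = bytes([0, 1, 2])
b64 = base64.b64encode(binary_data)

# quoted-printable: ASCII bytes pass through, other bytes become =XX.
latin1_text = "façade".encode("iso-8859-1")
qp = quopri.encodestring(latin1_text)

print(b64)  # b'AAEC'
print(qp)   # b'fa=E7ade'

# Both encodings are reversible.
assert base64.b64decode(b64) == binary_data
assert quopri.decodestring(qp) == latin1_text
```

Quoted-printable keeps mostly-ASCII text readable, while base64 grows the data by about a third regardless of content, which is why it is preferred for binary files.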
Since MIME is very open, it can also use third-party encoding formats.
The Content-Transfer-Encoding header specifies an encoding format for the message body, but it doesn't solve the problem of encoding the headers themselves (such as the message subject).
To encode headers with character sets which use more than 7 bits, such as for including accented letters in an email's subject, the MIME standard offers the following format:
=?charset?encoding?result?=
where:
- charset represents the character set used;
- encoding defines the encoding desired, with two possible values:
  - Q for quoted-printable,
  - B for base64;
- result is the text encoded using the specified method.
Below is an example of Quoted-Printable encoding with "Building façade" as the email's subject.
Subject: Building =?ISO-8859-1?Q?fa=E7ade?=
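Header encoding and decoding of this kind can be sketched with Python's standard email.header module. In the decoded string below, the encoded word covers the whole word "façade" and is separated from the plain text by a space, which is how RFC 2047 expects encoded words to be delimited. The library is free to encode the entire subject as a single encoded word, so its output may differ from a hand-written header.

```python
from email.header import Header, decode_header, make_header

# Encoding: the library wraps the subject in an encoded word using Q encoding.
encoded = Header("Building façade", charset="iso-8859-1").encode()
print(encoded)

# Decoding: split the header into (bytes, charset) chunks, then rebuild text.
raw_subject = "Building =?ISO-8859-1?Q?fa=E7ade?="
decoded = str(make_header(decode_header(raw_subject)))
print(decoded)

# Round trip of the library's own output.
round_trip = str(make_header(decode_header(encoded)))
print(round_trip)
```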
With the MIME type "multipart", the MIME standard allows for composite messages, meaning messages which include multiple attachments, which may even be nested.
To do so, MIME defines a separator called a boundary: an arbitrary string declared as a parameter of the Content-Type header.
Each occurrence of the separator delimits a part of the content, which begins with its own headers, such as Content-Type. It is essential that the value of this separator is not found within the message contents.
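To make the role of the boundary visible, the sketch below builds a two-part message with Python's standard email package and prints its serialized form. The boundary value example-boundary-42, the body text, and the file name are hypothetical. The boundary is fixed by hand here only to keep the output predictable; by default the library generates a unique one that does not collide with the content.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Plain body")
# Adding an attachment turns the message into multipart/mixed.
msg.add_attachment("a,b\n1,2\n", subtype="csv", filename="data.csv")

# Assumption for the example: a hand-picked boundary instead of a generated one.
msg.set_boundary("example-boundary-42")

wire = msg.as_string()
print(wire)

# Each part is introduced by "--boundary"; the last one is closed by "--boundary--".
print(wire.count("--example-boundary-42"))
```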
There are several multipart subtypes:
- multipart/mixed defines a series of multiple elements;
- multipart/alternative defines alternatives for the same information, such as a message in both plain text and HTML format. If the email client is able to display messages with a layout and is configured to do so, it will show the HTML version; otherwise, it will display the text version;
- multipart/parallel defines data presented at the same time (such as sound and image);
- multipart/signed defines a digital signature for message data;
- multipart/related defines related pieces of information.
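The choice a client makes between the alternatives of a multipart/alternative message can be sketched as follows, again with Python's standard email package. The helper pick_body and its can_render_html flag are hypothetical names standing in for a client's display settings.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Hello in plain text.")
# Adding an alternative turns the message into multipart/alternative.
msg.add_alternative("<p>Hello in <b>HTML</b>.</p>", subtype="html")

def pick_body(message, can_render_html):
    """Return the part a client would display, given its capabilities."""
    preference = ("html", "plain") if can_render_html else ("plain",)
    return message.get_body(preferencelist=preference)

rich = pick_body(msg, can_render_html=True)
simple = pick_body(msg, can_render_html=False)

print(msg.get_content_type())
print(rich.get_content_type())
print(simple.get_content_type())
```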
List of MIME types
MIME types are registered with the IANA (Internet Assigned Numbers Authority). Here is a non-exhaustive list of the most common MIME types.
Type of file
Files in ATOM format
IGES CAD exchange format
Non-interpreted binary files
Microsoft Word document files
Adobe Acrobat files
Rich text format
Microsoft Excel spreadsheet files
Microsoft PowerPoint presentation files
Compressed tar files
Compressed ZIP files
Basic audio files
MPEG audio files
MPEG-4 audio files
AIFF audio files
Wave audio files
PBM Bitmap files
PBM Graymap files
PBM Pixmap files
Zip archive files
GNU zip archive files
Comma-separated text files
Unformatted text files
Rich text files
Rich Text Format text files
Tab-separated text files
Microsoft Windows videos
- Official list of MIME types (IANA)
- Main RFCs:
  - RFC 2045: MIME Part One: Format of Internet Message Bodies
  - RFC 2046: MIME Part Two: Media Types
  - RFC 2047: MIME Part Three: Message Header Extensions for Non-ASCII Text
  - RFC 2048: MIME Part Four: Registration Procedures
  - RFC 2049: MIME Part Five: Conformance Criteria and Examples
- Secondary RFCs:
  - RFC 1524: The formal description of mailcap files. Mailcap files describe how to handle media types.
  - RFC 2015: MIME Security with Pretty Good Privacy (PGP).
  - RFC 2110: MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML).
  - RFC 2111: Content-ID and Message-ID Uniform Resource Locators.
  - RFC 2112: The MIME Multipart/Related Content-type.
  - RFC 2183: Defines the syntax and semantics of the "Content-Disposition" header to convey presentational information.
  - RFC 2184: MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations