Generally, a file’s type is checked on the server side to decide whether the file may be stored. For instance, a cloud document editor might only allow users to import .doc/.docx/.pages files.

However, checking only the file extension is not enough, because users can change the extension to get around this check. Therefore, we need a more reliable way to identify the real file type.

## Does renaming a file change its content?

If we only change a file’s extension, its content stays the same. We can verify this with an experiment: rename a video file so that it has a .pdf extension, and a video player can still play it. In other words, the video player has another way to tell whether something is a video file. After searching on Google, I found that every file has a File Signature (or Magic Number), which indicates the real type of the file. Fortunately, it is stored at a fixed position near the start of the file, typically within the first 4 to 8 bytes.

However, most solutions available online are implemented on the server side. That is not ideal: a file can only be checked after it has been uploaded, and a large file takes a long time to upload, which may frustrate users. Therefore, I looked for a front-end solution.

## How to detect the file type with HTML5?

Thanks to the HTML5 File API, we can easily read the signature of a file in the browser.

Here is the source code:
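
A minimal sketch of the idea, assuming a browser that supports `Blob.slice` and `FileReader.readAsArrayBuffer`. The names `bytesToHex` and `readSignature` and the `upload` element id are illustrative choices, not necessarily those of the original listing:

```javascript
// Convert raw bytes to an uppercase hex string, e.g. [0x25, 0x50] -> "2550".
function bytesToHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0"))
    .join("")
    .toUpperCase();
}

// Read only the first `length` bytes of a File/Blob and resolve with their
// hex form. Browser-only: relies on Blob.slice and FileReader.
function readSignature(file, length = 8) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(bytesToHex(new Uint8Array(reader.result)));
    reader.onerror = () => reject(reader.error);
    // Slicing first means the rest of the file is never loaded into memory.
    reader.readAsArrayBuffer(file.slice(0, length));
  });
}

// Hypothetical usage with <input type="file" id="upload">:
// document.getElementById("upload").addEventListener("change", async (e) => {
//   console.log(await readSignature(e.target.files[0]));
// });
```

Because only a small slice is read, the check should finish quickly even for very large files.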

## In Practice

However, when I used it in practice, I found that the signature length is not constant. Some files need only the first 4 bytes for detection, while others need the first 8. Even worse, some signatures start at an offset of 512 bytes. How can one universal solution detect all these different signatures?

My current solution is to build a signature library first, and then use each file’s extension to look up the signature expected for that extension. If the file’s signature does not match the one for its extension, the file is considered illegal.

This is the signature library I created:
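
As a rough sketch of what such a library could look like, each extension can map to a list of signatures with a byte offset, which covers signatures of different lengths and signatures that do not start at byte 0. The entries below are a small illustrative subset of well-known signatures, and the names `SIGNATURES` and `bytesNeeded` are assumptions rather than the original listing:

```javascript
// extension -> list of acceptable signatures.
// `offset` is the byte position where the signature starts,
// `hex` is the expected byte sequence written as a hex string.
const SIGNATURES = {
  pdf:  [{ offset: 0, hex: "25504446" }],          // "%PDF"
  png:  [{ offset: 0, hex: "89504E470D0A1A0A" }],
  jpg:  [{ offset: 0, hex: "FFD8FF" }],
  zip:  [{ offset: 0, hex: "504B0304" }],          // "PK" 03 04
  docx: [{ offset: 0, hex: "504B0304" }],          // Office Open XML is a zip container
  doc:  [{ offset: 0, hex: "D0CF11E0A1B11AE1" }],  // OLE2 compound file
  mp4:  [{ offset: 4, hex: "66747970" }],          // "ftyp", commonly after a 4-byte size
};

// Number of leading bytes that must be read to check every signature
// listed for `ext`; 0 when the extension is not in the library.
function bytesNeeded(ext) {
  const entries = SIGNATURES[ext.toLowerCase()] || [];
  return entries.reduce(
    (max, s) => Math.max(max, s.offset + s.hex.length / 2),
    0
  );
}

console.log(bytesNeeded("jpg")); // 3
console.log(bytesNeeded("mp4")); // 8
```

Storing the offset next to the bytes means a signature that starts at offset 512 needs no special handling: it is just another entry with `offset: 512`.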

Implementation:
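
One way the matching step might be written, assuming the start of the file has already been read into a `Uint8Array` and that the library maps extensions to `{ offset, hex }` entries. The function names and the tiny `demoLibrary` are hypothetical and serve only to illustrate the approach:

```javascript
// Lowercased extension of a file name, or "" when there is none.
function getExtension(fileName) {
  const dot = fileName.lastIndexOf(".");
  return dot > 0 ? fileName.slice(dot + 1).toLowerCase() : "";
}

// True when `bytes` (the start of the file) contains one of the signatures
// that `library` lists for the file's extension.
function matchesExtension(bytes, fileName, library) {
  const ext = getExtension(fileName);
  if (!Object.prototype.hasOwnProperty.call(library, ext)) {
    return false; // unknown extension: treated as illegal in this sketch
  }
  return library[ext].some(({ offset, hex }) => {
    // "25504446" -> [0x25, 0x50, 0x44, 0x46]
    const expected = hex.match(/../g).map((pair) => parseInt(pair, 16));
    // Bytes past the end of `bytes` are undefined, so short input never matches.
    return expected.every((byte, i) => bytes[offset + i] === byte);
  });
}

// Illustrative library with two entries.
const demoLibrary = {
  pdf: [{ offset: 0, hex: "25504446" }],
  mp4: [{ offset: 4, hex: "66747970" }],
};

// First 8 bytes of a typical MP4 file: a 4-byte size followed by "ftyp".
const mp4Header = new Uint8Array([0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70]);

console.log(matchesExtension(mp4Header, "movie.pdf", demoLibrary)); // false
console.log(matchesExtension(mp4Header, "movie.mp4", demoLibrary)); // true
```

Rejecting unknown extensions is a design choice of this sketch; a more permissive check could accept them and leave the decision to the server.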

## Summary

Detecting the file type on the front end may be the fastest approach, but it also has some drawbacks.

### 1. Some files share the same signature

For instance, files created by Microsoft Office (.xlsx, .docx, .pptx, etc.) have the same signature as zip files.
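
The consequence can be sketched in a few lines: a rename is only detectable when the real content’s signature differs from the one expected for the new extension. The `HEADERS` table and `renameIsDetectable` are illustrative names, and comparing whole hex strings is a simplification that ignores offsets and signatures that are prefixes of one another:

```javascript
// Signatures at offset 0, as hex strings (illustrative subset).
const HEADERS = {
  zip: "504B0304",
  docx: "504B0304",
  xlsx: "504B0304",
  pdf: "25504446",
  jpg: "FFD8FF",
};

// A signature check can only catch a rename when the two types
// do not share the same signature.
function renameIsDetectable(realType, claimedExt) {
  return HEADERS[realType] !== HEADERS[claimedExt];
}

console.log(renameIsDetectable("zip", "docx"));  // false
console.log(renameIsDetectable("zip", "jpg"));   // true
console.log(renameIsDetectable("docx", "xlsx")); // false
```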

Test:

| File Type | Successfully Intercepted |
| --- | --- |
| mp4 to pdf | |
| zip to docx | |
| zip to jpg | |
| docx to zip | |
| docx to xlsx | |

### 2. Compatibility

Unfortunately, HTML5’s FileReader is not supported by all browsers.

| Feature | Firefox (Gecko) | Chrome | Internet Explorer* | Opera* | Safari |
| --- | --- | --- | --- | --- | --- |
| Basic support | 3.6 (1.9.2) | 7 | 10 | 🚫 | 🚫 |
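
A simple feature check can decide whether to run the front-end detection or fall back to server-side validation. This is a sketch written in ES5 syntax so that it still parses in older browsers; the function name `supportsSignatureCheck` is a hypothetical choice:

```javascript
// True when `scope` (normally `window`) exposes the File API pieces that
// reading a file's first bytes depends on: FileReader and Blob.slice.
function supportsSignatureCheck(scope) {
  return (
    !!scope.FileReader &&
    !!scope.Blob &&
    !!scope.Blob.prototype &&
    typeof scope.Blob.prototype.slice === "function"
  );
}

// Hypothetical usage in a page:
// if (supportsSignatureCheck(window)) {
//   /* check the signature before uploading */
// } else {
//   /* upload and let the server validate the file */
// }
```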

### 3. File Signature Library

File signatures can be looked up in the File Signature Database, but not every file type has an entry there, even though the site is updated regularly.