PATSTAT (the EPO Worldwide PATent STATistical Database) is a single database of raw patent statistics, maintained by the European Patent Office (EPO) and developed in cooperation with the World Intellectual Property Organization (WIPO), the OECD and Eurostat. PATSTAT provides raw patent data from around 90 patent offices worldwide, including the largest and most important ones, such as the EPO and the United States Patent and Trademark Office (USPTO).
The data set includes the full set of bibliographic variables for each patent application, in particular:

  • Priority, application, and publication numbers and dates
  • Title and abstract
  • Designated states for protection
  • Status of application
  • Main and secondary International Patent Classification (IPC) codes
  • Applicant’s name and address
  • Inventors’ names and addresses
  • References (citations) to prior-art patents and to non-patent literature

A major problem with PATSTAT is that the data are provided in a raw format. CRIOS has therefore processed the PATSTAT data extensively to produce a cleaned and harmonized database: PATSTAT-CRIOS1. The processing consisted mainly of cleaning and standardizing the raw information provided by the EPO.
In particular, names were standardized at the level of individual inventors and applicants.
In addition, each patent document also reports further information not included in PATSTAT (for instance, concordance tables that convert IPC codes into more aggregated and manageable technological classes2, or NUTS3).
The data included in these reports cover the EPO only. The last update comes from the PATSTAT DVD set released in 10/2016. EPO applications start in 1978; however, in many reports organized by priority date you will find earlier dates, because a priority filing can predate the EPO application.

Yearly time series (by country, region, or company) always begin with the first year of activity (first year with a non-zero figure) and end with the last year of activity (last year with a non-zero figure).
That is, zero-years are trimmed at both ends of the time series.
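
The trimming rule above can be illustrated with a minimal sketch. It assumes a yearly series stored as a plain year-to-count mapping; the function name `trim_zero_years` and the sample figures are hypothetical and are not taken from the PATSTAT-CRIOS code base.

```python
def trim_zero_years(series):
    """Drop zero-valued years at both ends of a {year: count} mapping.

    Zero-valued years that fall between the first and the last active
    year are kept, so the trimmed series has no gaps.
    """
    years = sorted(series)
    active = [year for year in years if series[year] != 0]
    if not active:
        # No year with a non-zero figure: nothing to report.
        return {}
    first, last = active[0], active[-1]
    return {year: series[year] for year in years if first <= year <= last}


# Hypothetical yearly counts for one country (illustrative numbers only).
counts = {1976: 0, 1977: 0, 1978: 3, 1979: 0, 1980: 5, 1981: 0}
trimmed = trim_zero_years(counts)
print(trimmed)
```

In this sketch only the leading and trailing zero-years are removed; a zero-year inside the active period (1979 in the sample) stays in the series.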

1 For a detailed description of the algorithm, please refer to Coffano, Monica and Tarasconi, Gianluca, Crios - Patstat Database: Sources, Contents and Access Rules (February 1, 2014). Available at SSRN: http://ssrn.com/abstract=2404344


2 Among them, we have often adopted a technology-oriented classification, jointly developed by Fraunhofer Gesellschaft-ISI (Karlsruhe), Institut National de la Propriété Industrielle (INPI, Paris) and Observatoire des Sciences et des Techniques (OST, Paris). This classification aggregates all IPC codes into thirty technology fields.



