Patent Bots Quality Scores

Patent Bots quality scores are based on the average number of errors found in issued patents. We ran our automated patent proofreading on the patents issued during one recent year.

We ranked firms and companies with at least 50 issued patents during this year. We worked hard to make our quality scores as fair as possible, and we provide details below.

About Our Quality Scores

How did you do this?

The USPTO makes it easy to download the full text of issued patents. We downloaded issued patents for an entire year and then used our automated patent proofreading to count the number of errors in each patent (it took a while!).

What errors did you count?

Patent Bots proofreading provides errors and warnings in proofreading results. For the purpose of computing quality scores, we used only numbering errors, antecedent basis errors, and word support errors (words that don't appear in the detailed description). We did not use any other errors (e.g., phrase support or reference labels) and we did not use any warnings (anything in yellow in Patent Bots results).

This strategy will both overcount and undercount the actual number of errors in any individual patent. Errors may be overcounted because we occasionally have false alarms in our results. Errors may be undercounted because some warnings are actually errors and because we are not counting all of our errors (e.g., reference label errors).

Although the precise error counts in any individual patent may not be accurate, the average number of errors across all of a firm's patents is a good indicator of the quality of the firm's work.
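The counting rule and the per-firm average described above can be sketched roughly as follows. This is only an illustration: the record layout, the field names (`firm`, `findings`, `severity`, `category`), and the category labels are hypothetical and do not reflect the actual Patent Bots output format.

```python
from collections import defaultdict

# Hypothetical labels for the three error types that count toward the score.
COUNTED_CATEGORIES = {"numbering", "antecedent_basis", "word_support"}


def count_errors(findings):
    """Count findings that are errors (not warnings) in a counted category."""
    return sum(
        1
        for f in findings
        if f["severity"] == "error" and f["category"] in COUNTED_CATEGORIES
    )


def average_errors_by_firm(patents):
    """Return {firm: average counted errors per patent}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for patent in patents:
        totals[patent["firm"]] += count_errors(patent["findings"])
        counts[patent["firm"]] += 1
    return {firm: totals[firm] / counts[firm] for firm in totals}


# Made-up example data, for illustration only.
patents = [
    {"firm": "Firm A", "findings": [
        {"severity": "error", "category": "antecedent_basis"},
        {"severity": "warning", "category": "antecedent_basis"},  # ignored
        {"severity": "error", "category": "reference_label"},     # ignored
    ]},
    {"firm": "Firm A", "findings": [
        {"severity": "error", "category": "numbering"},
        {"severity": "error", "category": "word_support"},
    ]},
    {"firm": "Firm B", "findings": []},
]

averages = average_errors_by_firm(patents)
```

Warnings and uncounted error types are skipped, so in this made-up data Firm A has three counted errors across two patents.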

How do you compute the scores?

Warning, this is highly technical! First, we compute firm scores separately for each Tech Center. For each firm in a Tech Center, we compute the average number of errors per patent. We then fit a gamma probability distribution to these averages. A firm's score is 100 times one minus the value of the fitted cumulative distribution function at the firm's average number of errors. To get a score of 100, a firm must not have any errors at all.
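For readers who prefer code, here is a minimal sketch of the per-Tech-Center score. It makes two assumptions that the description above does not confirm: the gamma distribution is fit by the method of moments (the actual fitting method is not stated), and the firm averages are made-up numbers. The gamma CDF is evaluated with a standard power series so that the example needs only the Python standard library.

```python
import math
from statistics import mean, pvariance


def fit_gamma_moments(values):
    """Method-of-moments gamma fit (an assumed method): returns (shape, scale)."""
    m = mean(values)
    v = pvariance(values)
    if m <= 0 or v <= 0:
        raise ValueError("need a positive mean and variance to fit")
    return m * m / v, v / m


def gamma_cdf(x, shape, scale):
    """Gamma CDF, i.e. the regularized lower incomplete gamma P(shape, x/scale)."""
    if x <= 0:
        return 0.0
    z = x / scale
    # Series: P(a, z) = z^a e^-z / Gamma(a) * sum_n z^n / (a (a+1) ... (a+n))
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-15 * total and n < 10000:
        term *= z / (shape + n)
        total += term
        n += 1
    log_prefactor = -z + shape * math.log(z) - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))


def firm_scores(avg_errors_by_firm):
    """Score = 100 * (1 - CDF(average errors)) under the fitted gamma."""
    shape, scale = fit_gamma_moments(list(avg_errors_by_firm.values()))
    return {
        firm: 100.0 * (1.0 - gamma_cdf(avg, shape, scale))
        for firm, avg in avg_errors_by_firm.items()
    }


# Made-up averages for one Tech Center, for illustration only.
scores = firm_scores({"Firm A": 0.0, "Firm B": 1.0, "Firm C": 2.0, "Firm D": 5.0})
```

Because the CDF is zero at zero errors, only a firm with an average of exactly zero gets a score of 100, and the score falls as the average rises.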

To compute the overall scores that you see on this page, we compute a weighted sum of the firm's scores for each Tech Center. We do this because error rates are significantly higher in some Tech Centers (notably 1600) than in other Tech Centers. We feel that this weighted sum provides a fairer score for firms that do a lot of patents in Tech Center 1600. Because each Tech Center score comes from a gamma distribution fit to that Tech Center's own error rates, the scores are on a common 0 to 100 scale and can be combined in a fair way.
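The combination step might look something like the sketch below. The weights are an assumption: the text above does not say how they are chosen, so this example weights each Tech Center score by the share of the firm's patents in that Tech Center. The function name and the numbers are hypothetical.

```python
def overall_score(tc_scores, tc_patent_counts):
    """Weighted sum of per-Tech-Center scores.

    Assumed weights: the fraction of the firm's patents in each Tech Center.
    """
    total_patents = sum(tc_patent_counts[tc] for tc in tc_scores)
    if total_patents == 0:
        raise ValueError("firm has no patents")
    return sum(
        tc_scores[tc] * tc_patent_counts[tc] / total_patents for tc in tc_scores
    )


# Made-up firm: 30 patents in Tech Center 1600 and 10 in Tech Center 2100.
example = overall_score(
    {"1600": 40.0, "2100": 80.0},
    {"1600": 30, "2100": 10},
)
# (40 * 30 + 80 * 10) / 40 = 50.0
```

Since each Tech Center score already ranks the firm against its peers in that Tech Center, a firm concentrated in 1600 is compared mostly with other 1600 filers rather than with the overall population.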

Are the scores fair?

We think so. That said, we'll list some things to consider in evaluating the fairness of our scores:

  • The average number of errors in Tech Center 1600 is much higher than in other Tech Centers. Practitioners in this technology area appear to be less strict about antecedent basis, and our proofreading does not perform as well with biological sequences. We do try to account for this as explained above, but firms that have a lot of patents in 1600 might have scores that are a little lower than they should be.
  • We count the average number of errors per patent, so if your patents have many more claims than average, your error rate may be higher.
  • There are many typos in firm names in issued patents. We have tried to merge similar names, but if your firm is not consistent (e.g., "Smith IP" and "Smith Intellectual Property"), then we may not have merged all your data. If your firm name is very close to another firm name (e.g., differs by only one letter), then it is possible that we merged data of your firm with another firm.
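To illustrate the name-merging caveat in the last point, here is a small sketch that groups firm names by string similarity using Python's `difflib`. This is not necessarily how the names were actually merged; the greedy grouping, the 0.85 threshold, and the firm names "Jones Law" and "Jonas Law" are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_names(names, threshold=0.85):
    """Greedily group each name with the first group whose lead name is similar."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


groups = merge_names(
    ["Smith IP", "Smith Intellectual Property", "Jones Law", "Jonas Law"]
)
```

With these inputs, "Smith IP" and "Smith Intellectual Property" stay in separate groups because the strings are too different, while "Jones Law" and "Jonas Law" are merged because they differ by a single letter. Both failure modes described above show up even in this tiny example.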