API Documentation

Reader class

Base class for EasyOCR

Parameters

  • lang_list (list) - list of language code you want to recognize, for example ['ch_sim','en']. List of supported language code is here.
  • gpu (bool, string, default = True) - enable GPU
  • model_storage_directory (string, default = None) - Path to directory for model data. If not specified, models will be read from a directory as defined by the environment variable EASYOCR_MODULE_PATH (preferred), MODULE_PATH (if defined), or ~/.EasyOCR/.
  • download_enabled (bool, default = True) - enable download if EasyOCR is not able locate model files
  • user_network_directory (bool, default = None) - Path to user-defined recognition network. If not specified, models will be read from MODULE_PATH + '/user_network' (~/.EasyOCR/user_network).
  • recog_network (string, default = 'standard') - Instead of standard mode, you can choose your own recognition network --- Tutorial to be written about this.
  • detector (bool, default = True) - load detection model into memory
  • recognizer (bool, default = True) - load recognition model into memory

Attribute

  • lang_char - Show all available characters in current model

readtext method

Main method for Reader object. There are 4 groups of parameter: General, Contrast, Text Detection and Bounding Box Merging.

Parameters 1: General

  • image (string, numpy array, byte) - Input image
  • decoder (string, default = 'greedy') - options are 'greedy', 'beamsearch' and 'wordbeamsearch'.
  • beamWidth (int, default = 5) - How many beam to keep when decoder = 'beamsearch' or 'wordbeamsearch'
  • batch_size (int, default = 1) - batch_size>1 will make EasyOCR faster but use more memory
  • workers (int, default = 0) - Number thread used in of dataloader
  • allowlist (string) - Force EasyOCR to recognize only subset of characters. Useful for specific problem (E.g. license plate, etc.)
  • blocklist (string) - Block subset of character. This argument will be ignored if allowlist is given.
  • detail (int, default = 1) - Set this to 0 for simple output
  • paragraph (bool, default = False) - Combine result into paragraph
  • min_size (int, default = 10) - Filter text box smaller than minimum value in pixel
  • rotation_info (list, default = None) - Allow EasyOCR to rotate each text box and return the one with the best confident score. Eligible values are 90, 180 and 270. For example, try [90, 180 ,270] for all possible text orientations.

Parameters 2: Contrast

  • contrast_ths (float, default = 0.1) - Text box with contrast lower than this value will be passed into model 2 times. First is with original image and second with contrast adjusted to 'adjust_contrast' value. The one with more confident level will be returned as a result.
  • adjust_contrast (float, default = 0.5) - target contrast level for low contrast text box

Parameters 3: Text Detection (from CRAFT)

  • text_threshold (float, default = 0.7) - Text confidence threshold
  • low_text (float, default = 0.4) - Text low-bound score
  • link_threshold (float, default = 0.4) - Link confidence threshold
  • canvas_size (int, default = 2560) - Maximum image size. Image bigger than this value will be resized down.
  • mag_ratio (float, default = 1) - Image magnification ratio

Parameters 4: Bounding Box Merging

This set of parameter controls when adjacent bounding boxes merge with each other. Every parameters except 'slope_ths' is in the unit of box height.

  • slope_ths (float, default = 0.1) - Maximum slope (delta y/delta x) to considered merging. Low value means tiled boxes will not be merged.
  • ycenter_ths (float, default = 0.5) - Maximum shift in y direction. Boxes with different level should not be merged.
  • height_ths (float, default = 0.5) - Maximum different in box height. Boxes with very different text size should not be merged.
  • width_ths (float, default = 0.5) - Maximum horizontal distance to merge boxes.
  • ocr_parameter
  • add_margin (float, default = 0.1) - Extend bounding boxes in all direction by certain value. This is important for language with complex script (E.g. Thai).
  • x_ths (float, default = 1.0) - Maximum horizontal distance to merge text boxes when paragraph=True.
  • y_ths (float, default = 0.5) - Maximum verticall distance to merge text boxes when paragraph=True.

Return list of results

detect method

Method for detecting text boxes.

Parameters

  • image (string, numpy array, byte) - Input image
  • min_size (int, default = 10) - Filter text box smaller than minimum value in pixel
  • text_threshold (float, default = 0.7) - Text confidence threshold
  • low_text (float, default = 0.4) - Text low-bound score
  • link_threshold (float, default = 0.4) - Link confidence threshold
  • canvas_size (int, default = 2560) - Maximum image size. Image bigger than this value will be resized down.
  • mag_ratio (float, default = 1) - Image magnification ratio
  • slope_ths (float, default = 0.1) - Maximum slope (delta y/delta x) to considered merging. Low value means tiled boxes will not be merged.
  • ycenter_ths (float, default = 0.5) - Maximum shift in y direction. Boxes with different level should not be merged.
  • height_ths (float, default = 0.5) - Maximum different in box height. Boxes with very different text size should not be merged.
  • width_ths (float, default = 0.5) - Maximum horizontal distance to merge boxes.
  • add_margin (float, default = 0.1) - Extend bounding boxes in all direction by certain value. This is important for language with complex script (E.g. Thai).
  • optimal_num_chars (int, default = None) - If specified, bounding boxes with estimated number of characters near this value are returned first.

Return horizontal_list, free_list - horizontal_list is a list of regtangular text boxes. The format is [x_min, x_max, y_min, y_max]. free_list is a list of free-form text boxes. The format is [[x1,y1],[x2,y2],[x3,y3],[x4,y4]].

recognize method

Method for recognizing characters from text boxes. If horizontal_list and free_list are not given. It will treat the whole image as one text box.

Parameters

  • image (string, numpy array, byte) - Input image
  • horizontal_list (list, default=None) - see format from output of detect method
  • free_list (list, default=None) - see format from output of detect method
  • decoder (string, default = 'greedy') - options are 'greedy', 'beamsearch' and 'wordbeamsearch'.
  • beamWidth (int, default = 5) - How many beam to keep when decoder = 'beamsearch' or 'wordbeamsearch'
  • batch_size (int, default = 1) - batch_size>1 will make EasyOCR faster but use more memory
  • workers (int, default = 0) - Number thread used in of dataloader
  • allowlist (string) - Force EasyOCR to recognize only subset of characters. Useful for specific problem (E.g. license plate, etc.)
  • blocklist (string) - Block subset of character. This argument will be ignored if allowlist is given.
  • detail (int, default = 1) - Set this to 0 for simple output
  • paragraph (bool, default = False) - Combine result into paragraph
  • contrast_ths (float, default = 0.1) - Text box with contrast lower than this value will be passed into model 2 times. First is with original image and second with contrast adjusted to 'adjust_contrast' value. The one with more confident level will be returned as a result.
  • adjust_contrast (float, default = 0.5) - target contrast level for low contrast text box

Return list of results