TextIn - API Center - Intelligent Document Extraction

Function Description

General information extraction

Intelligent Document Extraction Service - API Documentation

Request URL

https://api.textin.com/ai/service/v1/entity_extraction

HTTP Request Method

HTTP POST

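Request Headers

Add the following custom headers to the HTTP request.

Header name	Value
x-ti-app-id	Log in and go to "Workbench - Account Settings - Developer Information" to find your x-ti-app-id
x-ti-secret-code	Log in and go to "Workbench - Account Settings - Developer Information" to find your x-ti-secret-code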

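URL Parameters

URL parameters are key-value pairs appended to the URL in the form {parameter name}={parameter value}. The query string starts with ?, and parameters are joined with &, for example ?p1=v1&p2=v2

Parameter	Data type	Required	Allowed values	Description
key	string	No	See description	Keys to extract. Single-key request example: /ai/service/v1/entity_extraction?key=姓名 (姓名 means "name"). Join multiple keys with commas, for example: /ai/service/v1/entity_extraction?key=姓名,年龄 (年龄 means "age")
table_header	string	No	See description	Table column headers to extract when extracting tables; passed the same way as key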
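As a rough illustration of the parameter rules above, the following sketch builds the request URL with Python's standard library. The function name `build_url` and its arguments `keys` and `table_headers` are hypothetical; keeping commas unescaped mirrors the examples in the table, and it is an assumption that the server also accepts percent-encoded non-ASCII characters.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.textin.com/ai/service/v1/entity_extraction"


def build_url(keys=(), table_headers=()):
    """Return the endpoint URL with optional key / table_header parameters.

    Hypothetical helper: multiple values are joined with commas, as the
    parameter table describes.
    """
    params = {}
    if keys:
        params["key"] = ",".join(keys)
    if table_headers:
        params["table_header"] = ",".join(table_headers)
    if not params:
        return BASE_URL
    # safe="," keeps the comma separator literal; other reserved and
    # non-ASCII characters are percent-encoded.
    return BASE_URL + "?" + urlencode(params, safe=",")
```

For example, `build_url(["姓名", "年龄"])` yields a URL whose `key` parameter carries both names, percent-encoded and separated by a comma.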

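Request Body

Content-Type: application/octet-stream

Supported file formats: png, jpg, jpeg, doc, docx, pdf, ofd, xlsx, xls. At most 20 pages of a document are processed. The combined number of key and table_header entries to extract must not exceed 30; if the limit is exceeded, key entries take priority.

Note that the request body is the file's raw binary stream, not FormData or any other format. The file size must not exceed 50M, and the image width and height must each be between 20 and 10000 pixels.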
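The request described so far (POST, two custom headers, raw binary body) could be assembled roughly as follows. This is a minimal sketch using only the standard library; `build_request` and its arguments are hypothetical names, the extension check is a client-side convenience rather than something the service requires, and "50M" is read here as 50 MiB, which is an assumption.

```python
import os
import urllib.request

API_URL = "https://api.textin.com/ai/service/v1/entity_extraction"
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "doc", "docx", "pdf", "ofd", "xlsx", "xls"}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: "50M" interpreted as 50 MiB


def build_request(file_name, file_bytes, app_id, secret_code, url=API_URL):
    """Build (but do not send) a POST request carrying the raw file bytes."""
    ext = os.path.splitext(file_name)[1].lstrip(".").lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file format: {ext!r}")
    if len(file_bytes) > MAX_FILE_BYTES:
        raise ValueError("file is larger than the documented 50M limit")
    return urllib.request.Request(
        url,
        data=file_bytes,  # the body is the binary stream itself, not FormData
        method="POST",
        headers={
            "x-ti-app-id": app_id,
            "x-ti-secret-code": secret_code,
            "Content-Type": "application/octet-stream",
        },
    )
```

The returned object can then be passed to `urllib.request.urlopen`, and the response body decoded as JSON.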

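Response

Content-Type: application/json

The JSON structure is described below. Each leading + marks one level of nesting.

Field	Type	Description
version	string	Version number
code	integer	Error code; see the error code section
message	string	Error message
duration	integer	Inference time (ms)
result	object
+ category	object
++ field1	string	Type of the field under details; field1 corresponds to the field of the same name under details
++ row	string	Table type
+ rotated_image_width	integer	Width of the document in its upright orientation; valid only when the document is an image
+ rotated_image_height	integer	Height of the document in its upright orientation; valid only when the document is an image
+ page_count	integer	Number of pages processed by intelligent document extraction; when the document exceeds the page limit (20 pages), the limit is returned
+ image_angle	integer	Document angle: the number of degrees the original document must be rotated clockwise to reach its upright orientation; valid only when the document is an image
+ details	object	Document extraction results
++ field1	object	Extraction result for a single key
+++ value	string	Recognized field value
+++ position	array	Coordinates of the recognized value in the original image, as an array of length 8 [0,1,2,3,4,5,6,7]: (0, 1) top-left corner, (2, 3) top-right corner, (4, 5) bottom-right corner, (6, 7) bottom-left corner
+++ description	string	Chinese description of the field
+++ lines	array	Text-line entries; each contains page, text, pos, and char_pos (see the JSON structure example)
++ row	array	Extraction results for table_header
+++ field2	object	Table field
++++ value	string	Recognized field value
++++ position	array	Coordinates of the recognized value in the original image, as an array of length 8 [0,1,2,3,4,5,6,7]: (0, 1) top-left corner, (2, 3) top-right corner, (4, 5) bottom-right corner, (6, 7) bottom-left corner
++++ description	string	Chinese description of the field
++++ lines	array	Text-line entries; each contains page, text, pos, and char_pos (see the JSON structure example)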
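The 8-element position layout in the table above can be unpacked into four corner points. The sketch below shows one way to do that and to derive an axis-aligned bounding box; both function names are hypothetical and not part of the API.

```python
def position_to_corners(position):
    """Split [x0, y0, x1, y1, x2, y2, x3, y3] into four (x, y) tuples.

    Order follows the field table: top-left, top-right,
    bottom-right, bottom-left.
    """
    if len(position) != 8:
        raise ValueError("position must contain exactly 8 numbers")
    return [(position[i], position[i + 1]) for i in range(0, 8, 2)]


def bounding_box(position):
    """Return (min_x, min_y, max_x, max_y) enclosing the four corners."""
    corners = position_to_corners(position)
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)
```

A bounding box is convenient for cropping, but it discards rotation; keep the four corners when the region may be skewed.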

JSON Structure Example

{
  "version": "v1.6.5",
  "code": 200,
  "message": "success",
  "duration": 28,
  "result": {
    "category": {
      "field1": "one_to_one",
      "row": "item_list"
    },
    "rotated_image_width": 1000,
    "rotated_image_height": 2000,
    "page_count": 10,
    "image_angle": 90,
    "details": {
      "field1": {
        "value": "recognized field value",
        "position": [
          100,
          200,
          200,
          200,
          300,
          200,
          100,
          300
        ],
        "description": "Chinese description of the field",
        "lines": [
          {
            "page": 0,
            "text": "example",
            "pos": [
              100,
              200,
              200,
              200,
              300,
              200,
              100,
              300
            ],
            "char_pos": [
              [
                100,
                200,
                200,
                200,
                300,
                200,
                100,
                300
              ]
            ]
          }
        ]
      },
      "row": [
        {
          "field2": {
            "value": "recognized field value",
            "position": [
              100,
              200,
              200,
              200,
              300,
              200,
              100,
              300
            ],
            "description": "Chinese description of the field",
            "lines": [
              {
                "page": 0,
                "text": "example",
                "pos": [
                  100,
                  200,
                  200,
                  200,
                  300,
                  200,
                  100,
                  300
                ],
                "char_pos": [
                  [
                    100,
                    200,
                    200,
                    200,
                    300,
                    200,
                    100,
                    300
                  ]
                ]
              }
            ]
          }
        }
      ]
    }
  }
}
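To make the shape of the example response concrete, the following sketch walks `result.details` and separates single-key results from table rows. It is an illustration under assumptions: `summarize_details` is a hypothetical helper, and it distinguishes the two kinds of entry by JSON type (object versus array) as they appear in the example, rather than by the `category` values.

```python
def summarize_details(response):
    """Collect extracted values from a parsed response dictionary.

    Returns (fields, rows): fields maps each single key to its value,
    rows is a list of {column: value} dictionaries from table entries.
    """
    details = response.get("result", {}).get("details", {})
    fields, rows = {}, []
    for name, entry in details.items():
        if isinstance(entry, list):
            # Table entry: a list of rows, each mapping a column to a cell.
            for row in entry:
                rows.append({col: cell.get("value") for col, cell in row.items()})
        elif isinstance(entry, dict):
            # Single-key entry: keep only the recognized value.
            fields[name] = entry.get("value")
    return fields, rows
```

Applied to the example above, this yields one entry under `field1` and one table row containing `field2`.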

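Error Codes

Error code	Description
40101	x-ti-app-id or x-ti-secret-code is empty
40102	x-ti-app-id or x-ti-secret-code is invalid; verification failed
40103	Client IP is not on the whitelist
40003	Insufficient balance; please top up before continuing
40004	Parameter error; consult the technical documentation and check the parameters passed
40007	Robot does not exist or has not been published
40008	Robot has not been activated; activate it in the marketplace and retry
40301	Image type not supported
40302	Uploaded file size is invalid; the file must not exceed 50M
40303	File type not supported
40304	Image dimensions are invalid; width and height must each be between 20 and 10000 pixels
40305	No file was uploaded for recognition
40306	QPS limit exceeded
30203	Basic service failure; please retry later
500	Internal server error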
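A client might turn these codes into exceptions along the following lines. This is a sketch, not an official client: `TextInApiError` and `unwrap_result` are hypothetical names, treating 200 as the success code follows the JSON example, and the set of retryable codes is an assumption (only 30203 is explicitly described as worth retrying later).

```python
ERROR_DESCRIPTIONS = {
    40101: "x-ti-app-id or x-ti-secret-code is empty",
    40102: "x-ti-app-id or x-ti-secret-code is invalid",
    40103: "client IP is not on the whitelist",
    40003: "insufficient balance",
    40004: "parameter error",
    40007: "robot does not exist or has not been published",
    40008: "robot has not been activated",
    40301: "image type not supported",
    40302: "uploaded file size is invalid",
    40303: "file type not supported",
    40304: "image dimensions are invalid",
    40305: "no file was uploaded",
    40306: "QPS limit exceeded",
    30203: "basic service failure",
    500: "internal server error",
}

# Assumption: these codes may succeed on a later attempt.
RETRYABLE_CODES = {40306, 30203, 500}


class TextInApiError(Exception):
    """Raised when the response carries a non-success code."""

    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.retryable = code in RETRYABLE_CODES


def unwrap_result(response):
    """Return response["result"] on success, otherwise raise TextInApiError."""
    code = response.get("code")
    if code == 200:  # success code as shown in the JSON example
        return response.get("result", {})
    message = ERROR_DESCRIPTIONS.get(code) or response.get("message") or "unknown error"
    raise TextInApiError(code, message)
```

Callers can inspect `retryable` to decide whether to back off and resend or to surface the error immediately.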