Commit 7062e99

Add docparser code and update readme file
1 parent 4e62b94 commit 7062e99

23 files changed

Lines changed: 4763 additions & 3 deletions

README.md

Lines changed: 12 additions & 3 deletions
@@ -1,6 +1,6 @@
11
# DocParser: Hierarchical Structure Parsing of Document Renderings
22
## Code for the system presented in "DocParser: Hierarchical Structure Parsing of Document Renderings"
3-
[Updated paper](docparser.pdf)
3+
[paper](docparser.pdf)
44

55

66
### Installation and requirements
@@ -35,13 +35,15 @@ To setup via Anaconda, please follow these steps:
3535
- type `python setup.py develop`
3636

3737
7. Prepare the datasets:
38-
- Download arxivdocs-target and ICDAR files as shown on https://github.com/DS3Lab/arXivDocs
38+
- Download arxivdocs-target from https://github.com/DS3Lab/arXivDocs
39+
- To run the ICDAR demo, download the prepared files from:
40+
https://drive.google.com/file/d/1SdGTq80eUGqUJBA6kdVQBO9L6a_ijAcN/view?usp=sharing
3941
- Extract datasets to the `DocParser` subdirectory
4042
- (resulting in structure: `DocParser/datasets`).
4143

4244
8. Prepare the trained models:
4345
- Download from URL:
44-
https://drive.google.com/file/d/1Hi4-tg4Zmtx8zYiCg6IBi47R88PdmAW4/view?usp=sharing
46+
https://drive.google.com/file/d/1Hi4-tg4Zmtx8zYiCg6IBi47R88PdmAW4/view?usp=sharing
4547
- Extract the pretrained models to the `default_models` subdirectory in `DocParser/docparser/`
4648
- (resulting in structure `DocParser/docparser/default_models/`).
4749
- For convenience, we include the COCO pre-trained weights from https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5 in the zip file
@@ -63,6 +65,9 @@ To setup via Anaconda, please follow these steps:
6365

6466
### Evaluations
6567

68+
#### arXivDocs
69+
Our current system is likely to perform better on arXivDocs-target than the one evaluated in the last version of the paper, mostly due to further improvements to postprocessing.
70+
6671
#### ICDAR 2013, Table Structure Recognition
6772
Updated Results. We corrected a read-out error on the outputs of the provided evaluation script for documents with multiple tables.
6873

@@ -74,6 +79,10 @@ Updated Results. We corrected a read-out error on the outputs of the provided ev
7479

7580
(PDF-based system F1: 0.9221)
7681

82+
### Credits
83+
Parts of our code are based on:
84+
https://github.com/rafaelpadilla/Object-Detection-Metrics
85+
https://github.com/matterport/Mask_RCNN
7786

7887
### Reference
7988
Rausch, J., Martinez, O., Bissig, F., Zhang, C., & Feuerriegel, S. (2019). DocParser: Hierarchical Structure Parsing of Document Renderings. http://arxiv.org/abs/1911.01702
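
The dataset and model locations named in steps 7 and 8 of the README (`DocParser/datasets` and `DocParser/docparser/default_models/`) can be checked before running a demo. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` and the constant `EXPECTED_DIRS` are hypothetical, and the temporary directory only stands in for a real checkout.

```python
import tempfile
from pathlib import Path

# Directories the README says should exist after extracting the downloads,
# relative to the DocParser checkout. The constant name is an assumption.
EXPECTED_DIRS = ("datasets", "docparser/default_models")


def missing_dirs(repo_root):
    """Return the expected sub-directories that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


# Demonstration on a throwaway directory that only contains `datasets`.
with tempfile.TemporaryDirectory() as tmp_dir:
    repo_root = Path(tmp_dir) / "DocParser"
    (repo_root / "datasets").mkdir(parents=True)
    print("missing:", missing_dirs(repo_root))
```

An empty list means both directories are in place; the check does not verify the contents of the extracted archives.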

demos/demo_inference.py

Lines changed: 374 additions & 0 deletions
Large diffs are not rendered by default.

docparser/__init__.py

Whitespace-only changes.

docparser/logging.conf

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
[loggers]
2+
keys=root,simpleExample
3+
4+
[handlers]
5+
keys=consoleHandler
6+
7+
[formatters]
8+
keys=simpleFormatter
9+
10+
[logger_root]
11+
level=DEBUG
12+
handlers=consoleHandler
13+
14+
[logger_simpleExample]
15+
level=DEBUG
16+
handlers=consoleHandler
17+
qualname=simpleExample
18+
propagate=0
19+
20+
[handler_consoleHandler]
21+
class=StreamHandler
22+
level=DEBUG
23+
formatter=simpleFormatter
24+
args=(sys.stdout,)
25+
26+
[formatter_simpleFormatter]
27+
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
28+
datefmt=
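
A configuration file in this format can be loaded with the standard library's `logging.config.fileConfig`. The sketch below writes the same configuration to a temporary file and requests the `simpleExample` logger. How the DocParser code itself loads `logging.conf` is not shown in this diff, so the helper `load_logger` and the temporary path are assumptions for illustration only.

```python
import logging
import logging.config
import os
import tempfile

# Same content as docparser/logging.conf above.
CONFIG_TEXT = """\
[loggers]
keys=root,simpleExample

[handlers]
keys=consoleHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=DEBUG
handlers=consoleHandler

[logger_simpleExample]
level=DEBUG
handlers=consoleHandler
qualname=simpleExample
propagate=0

[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=simpleFormatter
args=(sys.stdout,)

[formatter_simpleFormatter]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
datefmt=
"""


def load_logger(config_path, name="simpleExample"):
    """Apply the INI-style logging config and return the named logger."""
    logging.config.fileConfig(config_path, disable_existing_loggers=False)
    return logging.getLogger(name)


# Write the config to a temporary file, standing in for the real path.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "logging.conf")
    with open(config_path, "w") as handle:
        handle.write(CONFIG_TEXT)
    logger = load_logger(config_path)

logger.debug("logging configured")
```

Because `propagate=0` is set for `simpleExample`, its records are emitted once by its own console handler and are not passed on to the root logger.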
Lines changed: 228 additions & 0 deletions
@@ -0,0 +1,228 @@
1+
from docparser.objdetmetrics_lib.utils import *
2+
3+
4+
class BoundingBox:
5+
def __init__(self,
6+
imageName,
7+
classId,
8+
x,
9+
y,
10+
w,
11+
h,
12+
typeCoordinates=CoordinatesType.Absolute,
13+
imgSize=None,
14+
bbType=BBType.GroundTruth,
15+
classConfidence=None,
16+
format=BBFormat.XYWH,
17+
bbox_id=None,
18+
column=None):
19+
"""Constructor.
20+
Args:
21+
imageName: String representing the image name.
22+
classId: String value representing class id.
23+
x: Float value representing the X upper-left coordinate of the bounding box.
24+
y: Float value representing the Y upper-left coordinate of the bounding box.
25+
w: Float value representing the width of the bounding box.
26+
h: Float value representing the height of the bounding box.
27+
typeCoordinates: (optional) Enum (Relative or Absolute) represents if the bounding box
28+
coordinates (x,y,w,h) are absolute or relative to size of the image. Default:'Absolute'.
29+
imgSize: (optional) 2D vector (width, height)=>(int, int) represents the size of the
30+
image of the bounding box. If typeCoordinates is 'Relative', imgSize is required.
31+
bbType: (optional) Enum (Groundtruth or Detection) identifies if the bounding box
32+
represents a ground truth or a detection. If it is a detection, classConfidence must
33+
be provided.
34+
classConfidence: (optional) Float value representing the confidence of the detected
35+
class. If bbType is Detection, classConfidence must be provided.
36+
format: (optional) Enum (BBFormat.XYWH or BBFormat.XYX2Y2) indicating the format of the
37+
coordinates of the bounding boxes. BBFormat.XYWH: <left> <top> <width> <height>
38+
BBFormat.XYX2Y2: <left> <top> <right> <bottom>.
39+
bbox_id: (optional) A unique ID (per image) to show which ground truth bbox a detection
40+
was matched with
41+
"""
42+
self._imageName = imageName
43+
self._typeCoordinates = typeCoordinates
44+
if typeCoordinates == CoordinatesType.Relative and imgSize is None:
45+
raise IOError(
46+
'Parameter \'imgSize\' is required. It is necessary to inform the image size.')
47+
if bbType == BBType.Detected and classConfidence is None:
48+
raise IOError(
49+
'For bbType=\'Detection\', it is necessary to inform the classConfidence value.')
50+
51+
self._classConfidence = classConfidence
52+
self._bbType = bbType
53+
self._classId = classId
54+
self._format = format
55+
self._column = column
56+
57+
# If relative coordinates, convert to absolute values
58+
# For relative coords: (x,y,w,h)=(X_center/img_width , Y_center/img_height)
59+
if (typeCoordinates == CoordinatesType.Relative):
60+
(self._x, self._y, self._w, self._h) = convertToAbsoluteValues(imgSize, (x, y, w, h))
61+
self._width_img = imgSize[0]
62+
self._height_img = imgSize[1]
63+
if format == BBFormat.XYWH:
64+
self._x2 = self._w
65+
self._y2 = self._h
66+
self._w = self._x2 - self._x
67+
self._h = self._y2 - self._y
68+
else:
69+
raise IOError(
70+
'For relative coordinates, the format must be XYWH (x,y,width,height)')
71+
# For absolute coords: (x,y,w,h)=real bb coords
72+
else:
73+
self._x = x
74+
self._y = y
75+
if format == BBFormat.XYWH:
76+
self._w = w
77+
self._h = h
78+
self._x2 = self._x + self._w
79+
self._y2 = self._y + self._h
80+
else: # format == BBFormat.XYX2Y2: <left> <top> <right> <bottom>.
81+
self._x2 = w
82+
self._y2 = h
83+
self._w = self._x2 - self._x
84+
self._h = self._y2 - self._y
85+
if imgSize is None:
86+
self._width_img = None
87+
self._height_img = None
88+
else:
89+
self._width_img = imgSize[0]
90+
self._height_img = imgSize[1]
91+
92+
self._bbox_id = bbox_id
93+
94+
def setAbsoluteBoundingBox(self, x, y, w, h):
95+
self._x = x
96+
self._y = y
97+
self._w = w
98+
self._h = h
99+
self._x2 = self._x + self._w
100+
self._y2 = self._y + self._h
101+
102+
def getAbsoluteBoundingBox(self, format=BBFormat.XYWH):
103+
if format == BBFormat.XYWH:
104+
return (self._x, self._y, self._w, self._h)
105+
elif format == BBFormat.XYX2Y2:
106+
return (self._x, self._y, self._x2, self._y2)
107+
108+
def getRelativeBoundingBox(self, imgSize=None):
109+
if imgSize is None and self._width_img is None and self._height_img is None:
110+
raise IOError(
111+
'Parameter \'imgSize\' is required. It is necessary to inform the image size.')
112+
if imgSize is not None:
113+
return convertToRelativeValues((imgSize[0], imgSize[1]),
114+
(self._x, self._y, self._w, self._h))
115+
else:
116+
return convertToRelativeValues((self._width_img, self._height_img),
117+
(self._x, self._y, self._w, self._h))
118+
119+
def getImageName(self):
120+
return self._imageName
121+
122+
def getBboxID(self):
123+
return self._bbox_id
124+
125+
def getColumn(self):
126+
return self._column
127+
128+
def setColumn(self, column):
129+
self._column = column
130+
131+
def setBboxID(self, bbox_id):
132+
self._bbox_id = bbox_id
133+
134+
def getConfidence(self):
135+
return self._classConfidence
136+
137+
def getFormat(self):
138+
return self._format
139+
140+
def getClassId(self):
141+
return self._classId
142+
143+
def setClassId(self, new_class_id):
144+
self._classId = new_class_id
145+
146+
def getImageSize(self):
147+
return (self._width_img, self._height_img)
148+
149+
def getCoordinatesType(self):
150+
return self._typeCoordinates
151+
152+
def getBBType(self):
153+
return self._bbType
154+
155+
@staticmethod
156+
def compare(det1, det2):
157+
det1BB = det1.getAbsoluteBoundingBox()
158+
det1ImgSize = det1.getImageSize()
159+
det2BB = det2.getAbsoluteBoundingBox()
160+
det2ImgSize = det2.getImageSize()
161+
162+
if det1.getClassId() == det2.getClassId() and \
163+
det1.getConfidence() == det2.getConfidence() and \
164+
det1BB[0] == det2BB[0] and \
165+
det1BB[1] == det2BB[1] and \
166+
det1BB[2] == det2BB[2] and \
167+
det1BB[3] == det2BB[3] and \
168+
det1ImgSize[0] == det2ImgSize[0] and \
169+
det1ImgSize[1] == det2ImgSize[1]:
170+
return True
171+
return False
172+
173+
@staticmethod
174+
def clone(boundingBox):
175+
absBB = boundingBox.getAbsoluteBoundingBox(format=BBFormat.XYWH)
176+
newBoundingBox = BoundingBox(
177+
boundingBox.getImageName(),
178+
boundingBox.getClassId(),
179+
absBB[0],
180+
absBB[1],
181+
absBB[2],
182+
absBB[3],
183+
typeCoordinates=boundingBox.getCoordinatesType(),
184+
imgSize=boundingBox.getImageSize(),
185+
bbType=boundingBox.getBBType(),
186+
classConfidence=boundingBox.getConfidence(),
187+
format=BBFormat.XYWH)
188+
return newBoundingBox
189+
190+
def get_union_bbox_xywh(self, other_bbox):
191+
[x0, y0, x1, y1] = self.getAbsoluteBoundingBox(format=BBFormat.XYX2Y2)
192+
[other_x0, other_y0, other_x1, other_y1] = other_bbox.getAbsoluteBoundingBox(format=BBFormat.XYX2Y2)
193+
union_x0 = min(x0, other_x0)
194+
union_y0 = min(y0, other_y0)
195+
union_x1 = max(x1, other_x1)
196+
union_y1 = max(y1, other_y1)
197+
union_w = union_x1 - union_x0
198+
union_h = union_y1 - union_y0
199+
200+
return [union_x0, union_y0, union_w, union_h]
201+
202+
def intersects(self, other_bbox):
203+
boxA = self.getAbsoluteBoundingBox(format=BBFormat.XYX2Y2)
204+
boxB = other_bbox.getAbsoluteBoundingBox(format=BBFormat.XYX2Y2)
205+
206+
if boxA[0] > boxB[2]:
207+
return False # boxA is right of boxB
208+
if boxB[0] > boxA[2]:
209+
return False # boxA is left of boxB
210+
if boxA[3] < boxB[1]:
211+
return False # boxA is above boxB
212+
if boxA[1] > boxB[3]:
213+
return False # boxA is below boxB
214+
return True
215+
216+
def intersectionArea(self, other_bbox):
217+
boxA = self.getAbsoluteBoundingBox(format=BBFormat.XYX2Y2)
218+
boxB = other_bbox.getAbsoluteBoundingBox(format=BBFormat.XYX2Y2)
219+
xA = max(boxA[0], boxB[0])
220+
yA = max(boxA[1], boxB[1])
221+
xB = min(boxA[2], boxB[2])
222+
yB = min(boxA[3], boxB[3])
223+
# intersection area (only meaningful when the boxes overlap; check intersects() first)
224+
return (xB - xA) * (yB - yA)
225+
226+
def getArea(self):
227+
_, _, w, h = self.getAbsoluteBoundingBox(format=BBFormat.XYWH)
228+
return w * h
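
The two coordinate formats used by the class (`XYWH`: left, top, width, height; `XYX2Y2`: left, top, right, bottom) and the intersection and union computations can be sketched independently of the `docparser` package. The function names below are hypothetical stand-ins, not the library's API. Unlike `intersectionArea` above, this sketch clamps each overlap extent at zero, so disjoint boxes give an area of 0 without a prior `intersects` check.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def xywh_to_xyxy(box: Box) -> Box:
    """Convert (left, top, width, height) to (left, top, right, bottom)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xyxy_to_xywh(box: Box) -> Box:
    """Convert (left, top, right, bottom) to (left, top, width, height)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


def intersection_area(a_xyxy: Box, b_xyxy: Box) -> float:
    """Overlap area of two corner-format boxes; 0 when they do not overlap."""
    left = max(a_xyxy[0], b_xyxy[0])
    top = max(a_xyxy[1], b_xyxy[1])
    right = min(a_xyxy[2], b_xyxy[2])
    bottom = min(a_xyxy[3], b_xyxy[3])
    # Clamping avoids a spurious positive product of two negative extents.
    return max(0.0, right - left) * max(0.0, bottom - top)


def union_box_xywh(a_xyxy: Box, b_xyxy: Box) -> Box:
    """Smallest box (in XYWH format) that encloses both corner-format boxes."""
    enclosing = (
        min(a_xyxy[0], b_xyxy[0]),
        min(a_xyxy[1], b_xyxy[1]),
        max(a_xyxy[2], b_xyxy[2]),
        max(a_xyxy[3], b_xyxy[3]),
    )
    return xyxy_to_xywh(enclosing)


box_a = xywh_to_xyxy((0.0, 0.0, 10.0, 10.0))
box_b = xywh_to_xyxy((5.0, 5.0, 10.0, 10.0))
box_c = xywh_to_xyxy((20.0, 20.0, 5.0, 5.0))  # disjoint from box_a

print(intersection_area(box_a, box_b))
print(intersection_area(box_a, box_c))
print(union_box_xywh(box_a, box_b))
```

Here `box_a` and `box_b` overlap in a 5 by 5 square, while `box_a` and `box_c` are separated along both axes, which is the case where an unclamped product of two negative extents would come out positive.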
