Utils object detectors
add_regression_target_to_pos_valid(anchor_boxes_dict)
Adds the regression target for positive and valid anchors. Otherwise, keeps (0, 0, 0, 0), which won't be used by the loss.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
anchor_boxes_dict | dict | Anchor boxes dictionary - 'anchor_img_coordinates': coordinates of the anchor box (input format, i.e. image space) - 'bboxes': bboxes with their xyxy coordinates (in image space) and iou - 'anchor_type': anchor type (pos, neg or neutral) - 'anchor_validity': anchor validity - 'best_bbox_index': bbox associated with this anchor | required |
Returns: dict: Updated anchor boxes with the regression targets
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
apply_regression(coordinates_and_regression)
Applies the result of a regression on an anchor box (or a ROI) given in xyhw format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
coordinates_and_regression | ndarray | An array composed of 8 objects: x_anc, y_anc, h_anc, w_anc, tx, ty, th, tw. (x_anc, y_anc, h_anc, w_anc) are the coordinates of the anchor (or of a ROI); (tx, ty, th, tw) are the predictions of a regression. Shape (8,) | required |
Returns: float: x coordinate of the upper left corner after applying the regression to the anchor box (or ROI) float: y coordinate of the upper left corner float: height float: width
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
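For reference, the usual Faster R-CNN decoding such a function performs can be sketched as follows (a minimal, illustrative version; the exact conventions used by the library may differ):

```python
import numpy as np

def apply_regression_sketch(coordinates_and_regression: np.ndarray):
    """Illustrative sketch (not the library code) of the usual Faster R-CNN
    regression decoding, assuming the deltas are expressed w.r.t. the anchor center."""
    x_anc, y_anc, h_anc, w_anc, tx, ty, th, tw = coordinates_and_regression
    # Center of the anchor / ROI
    cx, cy = x_anc + w_anc / 2.0, y_anc + h_anc / 2.0
    # Shift the center and rescale height / width (log-space deltas)
    cx_new = tx * w_anc + cx
    cy_new = ty * h_anc + cy
    h_new = h_anc * np.exp(th)
    w_new = w_anc * np.exp(tw)
    # Back to the upper-left corner, still in xyhw format
    return cx_new - w_new / 2.0, cy_new - h_new / 2.0, h_new, w_new
```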
calc_regr(coordinates_bbox, coordinates_anchor)
Gives the target of a regression given the coordinates of a bbox and of an anchor (or a ROI)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
coordinates_bbox | tuple | The coordinates of a bbox in opposite points format | required |
coordinates_anchor | tuple | The coordinates of an anchor (or a ROI) in opposite points format | required |
Returns: float: Gap between the centers (x coordinate) normalized by the width of the anchor float: Gap between the centers (y coordinate) normalized by the height of the anchor float: Height ratio : bbox / anchor (log version) float: Width ratio : bbox / anchor (log version)
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
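The returned values correspond to the standard Faster R-CNN regression targets; a minimal illustrative sketch (not the library code) could look like this:

```python
import numpy as np

def calc_regr_sketch(coordinates_bbox, coordinates_anchor):
    """Illustrative sketch of the usual regression targets, with both boxes
    given in opposite points format (x1, y1, x2, y2)."""
    bx1, by1, bx2, by2 = coordinates_bbox
    ax1, ay1, ax2, ay2 = coordinates_anchor
    # Centers and sizes of bbox and anchor
    bcx, bcy = (bx1 + bx2) / 2.0, (by1 + by2) / 2.0
    acx, acy = (ax1 + ax2) / 2.0, (ay1 + ay2) / 2.0
    ah, aw = ay2 - ay1, ax2 - ax1
    bh, bw = by2 - by1, bx2 - bx1
    # Gaps between centers normalized by the anchor size, log ratios for the sizes
    tx = (bcx - acx) / aw
    ty = (bcy - acy) / ah
    th = np.log(bh / ah)
    tw = np.log(bw / aw)
    return tx, ty, th, tw
```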
check_coordinates_validity(function)
Decorator to make sure that the coordinates are valid
Parameters:
Name | Type | Description | Default |
---|---|---|---|
function | Callable | Function to decorate | required |
Raises: ValueError: If a set of coordinates is impossible Returns: function: The decorated function
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
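A hypothetical sketch of such a decorator, assuming the decorated function takes (x1, y1, x2, y2) as its first positional arguments (the actual implementation may wrap differently):

```python
from functools import wraps

def check_coordinates_validity_sketch(function):
    """Hypothetical decorator: reject (x1, y1, x2, y2) coordinates where the
    second point is not below / right of the first."""
    @wraps(function)
    def wrapper(x1, y1, x2, y2, *args, **kwargs):
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"Impossible coordinates: ({x1}, {y1}, {x2}, {y2})")
        return function(x1, y1, x2, y2, *args, **kwargs)
    return wrapper
```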
complete_at_least_one_anchor_per_bbox(anchor_boxes_dict, bboxes_index_with_no_positive)
Completes the anchors dictionary so that each bbox has at least one positive anchor, if it is not already the case. If a bbox is not associated with any anchor, we associate it with the anchor with which it has the biggest iou (if this anchor is not already associated with another bbox).
Args:
anchor_boxes_dict (dict): Anchor boxes dictionary
- 'anchor_img_coordinates': Coordinates of the anchor box (input format, i.e. image space)
- 'bboxes': bboxes with their xyxy coordinates (in image space) and iou
- 'anchor_type': anchor type (pos, neg or neutral)
- 'anchor_validity': anchor validity
- 'best_bbox_index': bbox associated with this anchor
bboxes_index_with_no_positive (list): List of bboxes with no positive anchor associated
Returns: dict: Updated anchor boxes dictionary
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
create_fake_dict_rois_targets(img_data, subsampling_ratio, nb_rois_per_img)
Creates fake dict_rois_targets in the rare cases where the function limit_rois_targets gives an empty object (None).
Process : we return ROIs on the entire image, considered as background
Parameters:
Name | Type | Description | Default |
---|---|---|---|
img_data | dict | Metadata of the image after the preprocessing, in particular the size of the image | required |
subsampling_ratio | int | Subsampling of the base model (shared layers) | required |
nb_rois_per_img | int | Number of fake ROIs to return | required |
Returns: dict: The dictionary of fake "selected" ROIs
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
draw_bboxes(input_img, output_path=None, gt_bboxes=None, predicted_bboxes=None)
Adds bboxes to an image (np.ndarray)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_img | ndarray | Input image | required |
Kwargs: output_path (str): Path to the output file. If None, the result is not saved gt_bboxes (list): List of "ground truth" bboxes to display Each entry must be a dictionary with keys x1, y1, x2, y2 and (optional) class predicted_bboxes (list): List of bboxes coming from a prediction (same format as gt_bboxes) Raises: FileExistsError: If the output file already exists Returns: (np.ndarray) : The image with the boxes
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
draw_bboxes_from_file(input_path, output_path=None, gt_bboxes=None, predicted_bboxes=None)
Adds bboxes to an image from a file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_path | str | Path to the input image | required |
Kwargs: output_path (str): Path to the output file. If None, the result is not saved gt_bboxes (list): List of "ground truth" bboxes to display Each entry must be a dictionary with keys x1, y1, x2, y2 and (optional) class predicted_bboxes (list): List of bboxes coming from a prediction (same format as gt_bboxes) Raises: FileNotFoundError: If the input file does not exist Returns: (np.ndarray) : The image with the boxes
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
draw_rectangle_from_bbox(img, bbox, color=None, thickness=None, with_center=False)
Draws a rectangle in the image and adds a text (optional)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
img | ndarray | The considered image | required |
bbox | dict | The dictionary containing the coordinates and the text | required |
color | tuple | An RGB tuple giving the color of the rectangle | None |
thickness | int | The thickness of the rectangle | None |
with_center | bool | If True, also draws the center of the rectangle | False |
Raises: ValueError: If one of the keys 'x1', 'y1', 'x2', 'y2' is missing
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
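As an illustration only, an OpenCV-based sketch of the described behaviour (the real function may use different drawing options or defaults):

```python
import cv2
import numpy as np

def draw_rectangle_from_bbox_sketch(img: np.ndarray, bbox: dict,
                                    color=(255, 0, 0), thickness=2,
                                    with_center=False) -> np.ndarray:
    """Hypothetical version: draw the bbox rectangle, optionally its center and class text."""
    if not all(k in bbox for k in ('x1', 'y1', 'x2', 'y2')):
        raise ValueError("bbox must contain the keys 'x1', 'y1', 'x2', 'y2'")
    x1, y1, x2, y2 = (int(bbox[k]) for k in ('x1', 'y1', 'x2', 'y2'))
    cv2.rectangle(img, (x1, y1), (x2, y2), color, thickness)
    if with_center:
        cv2.circle(img, ((x1 + x2) // 2, (y1 + y2) // 2), 2, color, -1)
    if 'class' in bbox:  # Optional text above the rectangle
        cv2.putText(img, str(bbox['class']), (x1, max(y1 - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
    return img
```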
format_classifier_inputs_and_targets(dict_rois_targets, dict_classes, classifier_regr_scaling)
Transforms a dictionary of target ROIs into a suitable format for the classifier model
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dict_rois_targets | dict | Dictionary containing the possible inputs / targets of the classifier | required |
dict_classes | dict | Mapping of the classes of the model (must not contain 'bg'), format: {idx: label} | required |
classifier_regr_scaling | list<float> | Regression coefficients to apply to the coordinates | required |
Returns: np.ndarray: Coordinates of each selected ROI # Shape (1, nb_rois, 4), format x, y, h, w np.ndarray: Classification target of the classifier (with the background) # Shape (1, nb_rois, nb_classes + 1) np.ndarray: Two-part array # Shape (1, nb_rois, 2 * nb_classes * 4) -> First half: class identification ground truth, used to calculate the regression loss of the classifier # Shape (1, nb_rois, nb_classes * 4) -> Second half: regression target for the classifier (one regression per class) # Shape (1, nb_rois, nb_classes * 4)
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_all_viable_anchors_boxes(base_anchors, subsampling_ratio, feature_map_height, feature_map_width, im_resized_height, im_resized_width)
Gets a dictionary of 'viable' anchor boxes.
From a list of "base" anchors, we will take each point of a features map, get its initial coordinates (input of the model) and build as many anchors as "base" anchors with this point as a center. Then we filter out the ones which are outside the image
Parameters:
Name | Type | Description | Default |
---|---|---|---|
base_anchors | list | List of base anchors | required |
subsampling_ratio | int | Subsampling ratio of the shared model | required |
feature_map_height | int | Height of the features map | required |
feature_map_width | int | Width of the features map | required |
im_resized_height | int | Height of the input image (preprocessed, without padding) | required |
im_resized_width | int | Width of the input image (preprocessed, without padding) | required |
Returns: dict : set of 'viable' anchor boxes identified by (y, x, index_anchor)
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
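To illustrate the idea, a simplified sketch assuming base anchors are given as (height, width) pairs and that each feature-map point is projected to the center of its receptive field (both assumptions, not necessarily the library's exact conventions):

```python
import itertools

def get_all_viable_anchor_boxes_sketch(base_anchors, subsampling_ratio,
                                       feature_map_height, feature_map_width,
                                       im_resized_height, im_resized_width):
    """Illustrative sketch: project each feature-map point back to image space,
    center every base anchor on it and keep only anchors fully inside the image."""
    anchors = {}
    for y, x in itertools.product(range(feature_map_height), range(feature_map_width)):
        # Center of the corresponding point in image space (assumed convention)
        cx = (x + 0.5) * subsampling_ratio
        cy = (y + 0.5) * subsampling_ratio
        for index_anchor, (anc_h, anc_w) in enumerate(base_anchors):
            x1, y1 = cx - anc_w / 2, cy - anc_h / 2
            x2, y2 = cx + anc_w / 2, cy + anc_h / 2
            # Keep only anchors totally included in the (resized) image
            if x1 >= 0 and y1 >= 0 and x2 <= im_resized_width and y2 <= im_resized_height:
                anchors[(y, x, index_anchor)] = {'anchor_img_coordinates': (x1, y1, x2, y2)}
    return anchors
```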
get_area_from_xyxy(x1, y1, x2, y2)
Gives the area (absolute, not relative) of a rectangle in opposite points format
Args:
x1 (float): x coordinate of the upper left point
y1 (float): y coordinate of the upper left point
x2 (float): x coordinate of the bottom right point
y2 (float): y coordinate of the bottom right point
Returns:
Name | Type | Description |
---|---|---|
float | float | The absolute area of the rectangle |
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_classifier_test_inputs(rois_coordinates)
Formats the inputs for the classifier from ROIs proposed by the RPN (test case)
Process : For each ROI, we simply get the format x, y, h, w
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rois_coordinates | list<np.ndarray> | ROIs to transform. rois_coordinates must be a list with only one entry: the ROIs of the current image (for prediction, the batch_size is forced to 1). The unique entry is a numpy array # Shape (nb_rois, 4) # Format x1, y1, x2, y2 | required |
Raises: ValueError: If the number of elements in the list is different from 1 Returns: np.ndarray : ROIs to use as inputs of the model # Shape : (1, nb_rois, 4), format x, y, h, w
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_classifier_train_inputs_and_targets(model, img_data_batch, rois_coordinates)
Gives the regression and classification of the classifier from the ROIs given by the RPN
We got the ROIs from the RPN prediction (and transformed them via get_roi_from_rpn_predictions)
For each image we will:
- Calculate the ious between bboxes and ROIs
- Keep, for each ROI, the bbox with the biggest iou (if the iou is bigger than a threshold)
- Filter the ROIs to keep only some of them:
  - Allows us to keep OOM (memory usage) in check
  - With respect to the loss, we will, of course, only take into account the selected ROIs
  - We keep a balance between positive ROIs (match with an object) and negative ROIs (match with 'bg')
- Format the inputs and targets of the model
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | ModelKerasFasterRcnnObjectDetector | Model used (contains all the necessary configs) | required |
img_data_batch | list<dict> | List of img_data for the batch. Used here to get the bboxes of the images to define the targets of the classifier | required |
rois_coordinates | list<np.ndarray> | Final list of the ROIs selected for the classifier part. Each element is a numpy array containing the coordinates of the ROIs calculated for an image of the batch. Format x1, y1, x2, y2 (opposite points) | required |
Returns: np.ndarray : ROIs coordinates in input of the model - Format x, y, h, w # Shape : (batch_size, nb_rois_per_img, 4), format x, y, h, w np.ndarray : Targets of the classifier - classification # Shape (batch_size, nb_rois_per_img, (nb_classes + 1)) np.ndarray : Targets of the classifier - regression # Shape (batch_size, nb_rois_per_img, 2 * nb_classes * 4)
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_feature_map_size(input_height, input_width, subsampling_ratio)
Gives the size of the features map given the height and width of the image, using the subsampling_ratio of the shared model. For example, for VGG16, the subsampling_ratio is 16.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_height | int | Height of the image | required |
input_width | int | Width of the image | required |
subsampling_ratio | int | Subsampling ratio of the shared model | required |
Raises: ValueError: If incorrect dimension of the image (< 1) ValueError: If the subsampling_ratio is incorrect (< 1) Returns: int: Height of the features map int: Width of the features map
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
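A minimal sketch of the computation, assuming an integer (floor) division; e.g. with VGG16 (subsampling_ratio = 16), a 600 x 800 image gives a 37 x 50 features map:

```python
def get_feature_map_size_sketch(input_height: int, input_width: int, subsampling_ratio: int):
    """Illustrative sketch: features map size obtained by dividing the image size
    by the subsampling ratio of the shared layers."""
    if input_height < 1 or input_width < 1:
        raise ValueError("Image dimensions must be >= 1")
    if subsampling_ratio < 1:
        raise ValueError("subsampling_ratio must be >= 1")
    return input_height // subsampling_ratio, input_width // subsampling_ratio
```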
get_final_bboxes(final_boxes, img_data)
Resizes the final predicted boxes to image space and formats them.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
final_boxes | list | List of boxes valid from a probability AND coordinates (xyxy) point of view and with "no" overlap. Format [(cl, proba, coordinates), (...), ...] | required |
img_data | dict | Metadata associated with the image (used to resize predictions) | required |
Returns: A list of bboxes corresponding to the model predictions
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_iou(coordinatesA, coordinatesB)
Gives the intersection over union (iou) from the coordinates of two rectangles (in opposite points format)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
coordinatesA | tuple<float> | The coordinates of the first rectangle in the format (x1, y1, x2, y2) | required |
coordinatesB | tuple<float> | The coordinates of the second rectangle in the format (x1, y1, x2, y2) | required |
Returns: float: Intersection over union of the two rectangles
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
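For reference, the standard IoU computation on opposite points coordinates can be sketched as:

```python
def get_iou_sketch(coordinatesA, coordinatesB) -> float:
    """Illustrative sketch of the IoU between two rectangles in (x1, y1, x2, y2) format."""
    ax1, ay1, ax2, ay2 = coordinatesA
    bx1, by1, bx2, by2 = coordinatesB
    # Intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = sum of the areas minus the intersection
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```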
get_iou_anchors_bboxes(anchor_boxes_dict, image_bboxes)
Gives the iou of each anchor box with all the bboxes of a list (for example, all the bboxes of an image)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
anchor_boxes_dict | dict | Anchor boxes dictionary - 'anchor_img_coordinates': xyxy coordinates of the anchor box (input format, i.e. image space) | required |
image_bboxes | list<dict> | List of bboxes | required |
Returns: dict: The input dictionary to which we added a bboxes field containing coordinates and iou
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_new_img_size_from_min_side_size(height, width, img_min_side_size=300)
Gets the new dimensions of an image so that the smaller dimension is equal to img_min_side_size, while keeping the aspect ratio.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
height | int | Height of the base image | required |
width | int | Width of the base image | required |
Kwargs: img_min_side_size (int): Final size of the smaller dimension Raises: ValueError: If incorrect dimension of the image (< 1) ValueError: If img_min_side_size is incorrect (< 1) Returns: int: Resized height int: Resized width
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
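A minimal sketch of the resizing logic (the rounding convention is an assumption):

```python
def get_new_img_size_sketch(height: int, width: int, img_min_side_size: int = 300):
    """Illustrative sketch: scale the image so its smaller side equals
    img_min_side_size while preserving the aspect ratio."""
    if height < 1 or width < 1:
        raise ValueError("Image dimensions must be >= 1")
    if img_min_side_size < 1:
        raise ValueError("img_min_side_size must be >= 1")
    if height <= width:
        ratio = img_min_side_size / height
        return img_min_side_size, int(round(width * ratio))
    ratio = img_min_side_size / width
    return int(round(height * ratio)), img_min_side_size
```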
get_roi_from_rpn_predictions(model, img_data_batch, rpn_predictions_cls, rpn_predictions_regr)
Converts the output layers of the RPN (classification and regression) into ROIs
We get the prediction results of the RPN and we want to select regions of interest (ROIs) for the classifier part. For each point and each base anchor, we apply the results of the regression. Then we crop the resulting ROIs in order to stay within the limits of the image. Then we delete the unsuitable (i.e. invalid) ROIs and finally we apply a Non Max Suppression (NMS) algorithm to remove the ROIs which overlap too much.
Note: We work with float coordinates. It is no big deal, we will recast them to int to display them.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | ModelKerasFasterRcnnObjectDetector | Model used (contains all the necessary configs) | required |
img_data_batch | list<dict> | List of img_data of the batch. Here, it is used to get the (preprocessed) size of the images in order to remove the ROIs which are outside the image. Each entry must contain 'resized_height' & 'resized_width' | required |
rpn_predictions_cls | ndarray | Classification prediction (output RPN) # shape: (batch_size, height_feature_map, width_feature_map, nb_anchor) | required |
rpn_predictions_regr | ndarray | Regression prediction (output RPN) # shape: (batch_size, height_feature_map, width_feature_map, 4 * nb_anchor) | required |
Returns: list: The ROIs selected for each image of the batch
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_rois_bboxes_iou(rois, img_data, subsampling_ratio)
Gives the ious between the ROIs (in rois) and the bboxes (in img_data).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rois | ndarray | ROIs given by the RPN (i.e. by the function get_roi_from_rpn_predictions()). Shape (N, 4), N corresponds to the number of given ROIs (in general max 300, cf. model.nms_max_boxes) | required |
img_data | dict | Metadata of the image after the preprocessing. In particular, the bboxes have been resized and rotated if the image has been resized and rotated. We only use the 'bboxes' field | required |
subsampling_ratio | int | Subsampling of the base model (shared layers) - to apply to bboxes (which are in image space) | required |
Returns: dict: Dictionary containing all the ious between ROIs and bboxes of the image Keys: (index_roi) -> 'coordinates' -> 'x1', 'y1', 'x2', 'y2', 'h', 'w' -> (index_bbox) -> 'coordinates': 'x1', 'y1', 'x2', 'y2' -> 'iou' -> 'class'
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_rois_targets(dict_rois, classifier_min_overlap, classifier_max_overlap)
Finds, for each ROI, the bbox with the biggest iou and associates them. Then associates the class of this bbox with the ROI and, if the iou is sufficiently big, gives the associated regression.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dict_rois | dict | Dictionary containing all the ious between the ROIs and the bboxes of the image | required |
classifier_min_overlap | float | Minimal threshold to consider a ROI as a target of the classifier (which can still be 'bg') | required |
classifier_max_overlap | float | Minimal threshold to consider a ROI as matching with a bbox (so with a class which is not 'bg') | required |
Returns: dict: Dictionary containing the 'viable' ROIs and their classification and regression targets
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_rpn_targets(model, img_data_batch)
Gives the classification and regression targets for the RPN
We defined a set of possible anchor boxes (def. 9). For each point of the features map,
we look at the possible anchor boxes. We get back to the input image space and keep only the anchor boxes which are totally included in the image. Then, for each anchor box, we check if it matches with a bbox (via iou) and we define our target : match bbox vs match background and gap between anchor box and bbox for the regression part (only if there is a match on a bbox). We use this process for each image
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | ModelKerasFasterRcnnObjectDetector | Model used (contains all the necessary configs) | required |
img_data_batch | list | The list of img_data (dict) for the batch. Each entry is a dictionary with the content of an image (already preprocessed) and associated metadata: - 'img' -> image in numpy format (h, w, c), preprocessed and ready to be used by the model - 'bboxes' -> (dict) associated bboxes (preprocessed image format) 'x1', 'x2', 'y1', 'y2' - 'original_width' -> original width of the image - 'original_height' -> original height of the image - 'resized_width' -> resized width of the image (i.e. smaller dim set to img_min_side_size px (def. 300)) - 'resized_height' -> resized height of the image (i.e. smaller dim set to img_min_side_size px (def. 300)) - 'batch_width' -> width of the images in the batch (max width of the batch, we pad the smaller images with zeroes) - 'batch_height' -> height of the images in the batch (max height of the batch, we pad the smaller images with zeroes) | required |
Returns: np.ndarray: Classification targets : [y_is_box_valid] + [y_rpn_overlap] for each image with : - y_is_box_valid -> if a box is valid (and thus, should enter in the classification loss) - y_rpn_overlap -> target of the classification ('pos', 'neg' or 'neutral') # Shape (batch_size, feature_map_height, feature_map_width, nb_anchors * 2) np.ndarray: Regression targets : [y_rpn_overlap (repeated x 4)] + [y_rpn_regr] for each image with : - y_rpn_overlap -> if a box is an object (and thus, should enter in the regression loss) repeated to account for the 4 coordinates - y_rpn_regr -> regression targets # Shape (batch_size, feature_map_height, feature_map_width, nb_anchors * 2 * 4)
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_valid_boxes_from_coordinates(input_img, input_rois, fm_boxes_candidates, regr_coordinates, classifier_regr_scaling, subsampling_ratio, dict_classes)
Calculates the coordinates (in image space) after application of the regression of the boxes (in features map space) whose probability is sufficiently high. Then restricts them to the image and keeps only the valid boxes
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_img | ndarray | Resized image (useful to get the dimensions) | required |
input_rois | ndarray | ROIs given by the RPN | required |
fm_boxes_candidates | list | The boxes (in features map space) valid with respect to their proba | required |
regr_coordinates | ndarray | Regression prediction for the boxes | required |
classifier_regr_scaling | list | Scaling to remove from the regression results | required |
subsampling_ratio | int | Subsampling of the base model (shared layers) - to apply to bboxes (which are in image space) | required |
dict_classes | dict | Dictionary of the classes of the model | required |
Returns: A list of boxes valid from a probability AND coordinates (xyxy) point of view # Format [(cl, proba, coordinates), (...), ...]
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
get_valid_fm_boxes_from_proba(probas, proba_threshold, bg_index)
Keeps the predicted boxes (in features map space) whose probability is above a threshold. Also deletes all the boxes which matched on the background
Parameters:
Name | Type | Description | Default |
---|---|---|---|
probas | ndarray | Probabilities of the boxes predicted by the model | required |
proba_threshold | float | Threshold below which boxes are eliminated | required |
bg_index | int | Index of the background class (boxes matching it are discarded) | required |
Returns: A list of boxes (in features map space) valid from a probability point of view # Format [(index, index_cl, proba), (...), ...]
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
limit_rois_targets(dict_rois_targets, nb_rois_per_img)
Limits the number of inputs / targets for each image in order to avoid OOM (out of memory) issues
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dict_rois_targets | dict | Dictionary containing the possible inputs / targets of the classifier | required |
nb_rois_per_img | int | Maximal number of ROIs to return for each image. In the rare case where there are not enough ROIs, we clone the ROIs in order to have enough. If there is no ROI, we return None; this case is then handled by the function create_fake_dict_rois_targets | required |
Returns: dict: The dictionary of "selected" ROIs
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
non_max_suppression_fast(img_boxes_coordinates, img_boxes_probas, nms_overlap_threshold, nms_max_boxes, img_boxes_classes=None)
Filters boxes in order to limit overlaps on the same object using a list of boxes (ROIs or final predictions) and the probabilities of matching with an object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
img_boxes_coordinates | ndarray | The coordinates of the boxes (in opposite points format) # shape: (nb_boxes, 4) | required |
img_boxes_probas | ndarray | The probabilities associated with the boxes # shape: (nb_boxes,) | required |
nms_overlap_threshold | float | The iou value above which we assume that two boxes overlap | required |
nms_max_boxes | int | The maximal number of boxes that this function can return | required |
Kwargs: img_boxes_classes (np.ndarray): The classes associated with the boxes (optional) # shape: (nb_boxes) Raises: ValueError: If img_boxes_probas is not the same length as img_boxes_coordinates ValueError: If nms_overlap_threshold <= 0 or > 1 ValueError: If nms_max_boxes < 1 ValueError: If img_boxes_classes is not the same length as img_boxes_coordinates (if != None) Returns: np.ndarray: List of kept boxes # shape: (nb_boxes_kept, 4) np.ndarray: Associated probabilities np.ndarray: Associated classes (if prediction)
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
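The classic greedy NMS that such a function implements can be sketched as follows (illustrative only; the classes handling and edge cases of the library version are omitted):

```python
import numpy as np

def non_max_suppression_sketch(boxes: np.ndarray, probas: np.ndarray,
                               nms_overlap_threshold: float, nms_max_boxes: int):
    """Illustrative sketch of greedy NMS on boxes in (x1, y1, x2, y2) format."""
    if len(boxes) != len(probas):
        raise ValueError("boxes and probas must have the same length")
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(probas)[::-1]  # Highest probability first
    keep = []
    while len(order) > 0 and len(keep) < nms_max_boxes:
        i = order[0]
        keep.append(i)
        # IoU of the best remaining box with all the others
        ix1, iy1 = np.maximum(x1[i], x1[order[1:]]), np.maximum(y1[i], y1[order[1:]])
        ix2, iy2 = np.minimum(x2[i], x2[order[1:]]), np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, ix2 - ix1) * np.maximum(0.0, iy2 - iy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Discard the boxes which overlap too much with the kept one
        order = order[1:][iou <= nms_overlap_threshold]
    keep = np.array(keep, dtype=int)
    return boxes[keep], probas[keep]
```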
non_max_suppression_fast_on_preds(boxes_candidates, nms_overlap_threshold)
Applies the NMS algorithm on the valid predicted boxes to avoid overlaps.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
boxes_candidates | list | Valid predicted boxes. Format [(cl, proba, coordinates), (...), ...] | required |
nms_overlap_threshold | float | Above this threshold for the iou, two boxes are said to be overlapping | required |
Returns: A list of boxes valid from a probability AND coordinates (xyxy) point of view and with "no" overlap # Format [(cl, proba, coordinates), (...), ...]
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
restrict_and_convert_roi_boxes(bbox_coordinates)
Resizes the box to have the minimal size and crops it to stay in the features map. Finally, converts it to xyxy coordinates.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
bbox_coordinates | ndarray | An array composed of 6 objects: x_roi, y_roi, h_roi, w_roi, height_img_in_feature_map, width_img_in_feature_map. (x_roi, y_roi, h_roi, w_roi) are the coordinates of a ROI; (height_img_in_feature_map, width_img_in_feature_map) are the sizes of the corresponding image before padding, downsampled to the features map format | required |
Returns: float: Coordinates of the ROI after correction - x coordinate of the upper left point float: Coordinates of the ROI after correction - y coordinate of the upper left point float: Coordinates of the ROI after correction - x coordinate of the bottom right point float: Coordinates of the ROI after correction - y coordinate of the bottom right point
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
restrict_valid_to_n_regions(anchor_boxes_dict, num_regions)
Restricts the number of valid anchor boxes. If there are more positive anchor boxes than half of num_regions: - we invalidate positive anchors until there are less than num_regions / 2 of them - then, we invalidate anchors until the number of valid anchors is equal to num_regions
Parameters:
Name | Type | Description | Default |
---|---|---|---|
anchor_boxes_dict | dict | Anchor boxes dictionary - 'anchor_img_coordinates': coordinates of the anchor box (input format, i.e. image space) - 'bboxes': bboxes with their xyxy coordinates (in image space) and iou - 'anchor_type': anchor type (pos, neg or neutral) - 'anchor_validity': anchor validity - 'best_bbox_index': bbox associated with this anchor | required |
num_regions | int | The number of valid anchors we want to consider | required |
Returns: dict: Updated anchor boxes dictionary
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
select_final_rois(rois_coordinates, rois_probas, roi_nms_overlap_threshold, nms_max_boxes, feature_map_sizes)
Deletes the invalid ROIs and selects some of them to limit the overlaps
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rois_coordinates | ndarray | The set of all selected ROIs for all the images of the batch # shape: (batch_size, nb_rois, 4) | required |
rois_probas | ndarray | The probabilities associated with each selected ROI, for all the images # shape: (batch_size, nb_rois) | required |
roi_nms_overlap_threshold | float | Above this threshold for the iou, we assume that two ROIs overlap | required |
nms_max_boxes | int | Maximal number of ROIs that this function can return for each image | required |
feature_map_sizes | ndarray | Theoretical heights and widths of the features maps - useful if we have no valid ROI anymore. Allows us to manage the fact that, in a batch, we padded the images so that they all have the same size # shape: (batch_size, 2) | required |
Returns: list: The ROIs kept for each image of the batch
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
set_anchors_type_validity(anchor_boxes_dict, image_bboxes, rpn_min_overlap, rpn_max_overlap)
Defines the type and the validity of each anchor
Type:
- pos -> Match between the anchor and a bbox
- neg -> Match between the anchor and the background
- neutral -> In between the two, won't be used by the model
Validity:
- 1 -> If pos or neg
- 0 -> If neutral
Args:
anchor_boxes_dict (dict): Anchor boxes dictionary
- 'anchor_img_coordinates': Coordinates of the anchor box (input format, i.e. image space)
- 'bboxes': bboxes with their xyxy coordinates (in image space) and iou
image_bboxes (list<dict>): List of bboxes
rpn_min_overlap (float): iou threshold below which an anchor matches the background ('neg')
rpn_max_overlap (float): iou threshold above which an anchor matches a bbox ('pos')
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
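A sketch of the usual rule for assigning types and validity, assuming rpn_min_overlap and rpn_max_overlap are iou thresholds (illustrative, not the library code):

```python
def set_anchor_type_sketch(best_iou: float, rpn_min_overlap: float, rpn_max_overlap: float):
    """Illustrative sketch: returns (anchor_type, anchor_validity) from the best iou
    of an anchor with the image bboxes (e.g. thresholds 0.3 and 0.7)."""
    if best_iou >= rpn_max_overlap:
        return 'pos', 1      # Match between the anchor and a bbox
    if best_iou < rpn_min_overlap:
        return 'neg', 1      # Match with the background
    return 'neutral', 0      # In between: not used by the model
```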
xyhw_to_xyxy(x, y, h, w)
Changes a rectangle in the format xyhw (x, y, h, w) to the format xyxy (x1, y1, x2, y2)
Args:
x (float): x coordinate of the upper left point
y (float): y coordinate of the upper left point
h (float): height of the rectangle
w (float): width of the rectangle
Returns: float: x coordinate of the upper left point float: y coordinate of the upper left point float: x coordinate of the bottom right point float: y coordinate of the bottom right point
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
xyxy_to_cxcyhw(x1, y1, x2, y2)
Changes a rectangle in the format xyxy (x1, y1, x2, y2) to the format cxcyhw (cx, cy, h, w)
Args:
x1 (float): x coordinate of the upper left point
y1 (float): y coordinate of the upper left point
x2 (float): x coordinate of the bottom right point
y2 (float): y coordinate of the bottom right point
Returns: float: x coordinate of the center of the rectangle float: y coordinate of the center of the rectangle float: height of the rectangle float: width of the rectangle
Source code in template_vision/models_training/object_detectors/utils_object_detectors.py
xyxy_to_xyhw(x1, y1, x2, y2)
Changes a rectangle in the format xyxy (x1, y1, x2, y2) to the format xyhw (x, y, h, w)
Args:
x1 (float): x coordinate of the upper left point
y1 (float): y coordinate of the upper left point
x2 (float): x coordinate of the bottom right point
y2 (float): y coordinate of the bottom right point
Returns: float: x coordinate of the upper left point float: y coordinate of the upper left point float: height of the rectangle float: width of the rectangle
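As a quick reference, the three conversion helpers above amount to the following (minimal sketches; the _sketch names are illustrative):

```python
def xyhw_to_xyxy_sketch(x, y, h, w):
    """Upper-left corner + height/width -> opposite points (x1, y1, x2, y2)."""
    return x, y, x + w, y + h

def xyxy_to_xyhw_sketch(x1, y1, x2, y2):
    """Opposite points -> upper-left corner + height/width (x, y, h, w)."""
    return x1, y1, y2 - y1, x2 - x1

def xyxy_to_cxcyhw_sketch(x1, y1, x2, y2):
    """Opposite points -> center + height/width (cx, cy, h, w)."""
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0, y2 - y1, x2 - x1
```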