
2. Uploading and Downloading

2.1. Upload Files to a Collection

You can upload a single file or all files in a directory to a collection in the MetaLake; this is the same functionality as the 'Upload' feature in the web frontend. Optionally, you can include custom metadata and annotation data. Custom metadata may include attributes or additional information about the files. Annotation data may include labeling data for images. Only one type of content (either image or video) can be uploaded in a single API call.

Note that the following file types are currently supported for upload: jpeg, jpg, png, mp4, mkv.
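Because unsupported files are a common cause of partial uploads, it can help to check a directory listing before calling the SDK. The sketch below is not part of the SDK: the helper name `split_by_support`, the grouping of extensions by content type, and the case-insensitive matching are all assumptions made for illustration.

```python
from pathlib import Path

# Extensions taken from the note above. Grouping them by content type and
# matching case-insensitively are assumptions of this sketch.
SUPPORTED_EXTENSIONS = {
    "image": {".jpeg", ".jpg", ".png"},
    "video": {".mp4", ".mkv"},
}


def split_by_support(file_names, content_type):
    """Return (supported, unsupported) lists for the given content type."""
    allowed = SUPPORTED_EXTENSIONS[content_type]
    supported, unsupported = [], []
    for name in file_names:
        target = supported if Path(name).suffix.lower() in allowed else unsupported
        target.append(name)
    return supported, unsupported


supported, unsupported = split_by_support(
    ["cam_01.JPG", "cam_02.png", "notes.gif", "clip.mp4"], "image"
)
print("upload:", supported)
print("skip:", unsupported)
```

Running a check like this first makes it obvious which files would need a separate call (for example, videos mixed into an image directory).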

upload_files_to_collection(path, content_type, collection_name, meta_data_object, meta_data_override, file_meta_data_json_path, storage_prefix_path, annotation_data)

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| path | string | - | Directory or file path (must be an absolute path). The SDK automatically detects whether the path is a directory or a single file. |
| content_type | string | - | "image" for image files, "video" for video files and "other" for all other files |
| collection_name | string | - | A name for the collection. If the name of an existing collection is given, the files are added to that collection. |
| meta_data_object (optional) | dictionary | {} | Custom metadata field and value pairs. These are applied to all files being uploaded. |
| meta_data_override (optional) | boolean | False | If this flag is True, the metadata of already uploaded files will be overridden, even if the file is skipped during the upload process. |
| file_meta_data_json_path (optional) | string | None | Path of a JSON file containing metadata (both meta fields and tags) specific to each individual file. The format of the JSON is given below. |
| storage_prefix_path (optional) | string | None | By default, the uploaded file collection is created in the root directory of the bucket. To create it under a specific directory instead, use this parameter. Ex: 'dir_1/sub_dir_1' |
| annotation_data (optional) | dictionary | None | Contains details about annotation data. Fields within this dictionary are described below. |

annotation_data Fields

  • json_data_file_path: (String) The absolute path of the JSON file containing annotation data.
  • operation_unique_id: (String) A unique ID representing the relevant model run or annotation project. This ensures annotations from different sources aren't mixed. Using the same ID for multiple API calls will replace previous annotations with new ones.
  • is_normalized: (Boolean) Set to True if normalized values (instead of actual pixel values) are provided for coordinates and dimensions. If True, conversion occurs at the MetaLake.
  • is_model_run: (Boolean) Indicates the type of annotation: True for machine annotations and False for human annotations.
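To illustrate what "normalized values" could look like for a rectangle, the sketch below divides pixel coordinates by the image dimensions. The exact normalization convention expected by the MetaLake is not stated here, so the 0 to 1 range relative to image width and height is an assumption, as is the helper name `normalize_bbox`.

```python
def normalize_bbox(bbox, image_width, image_height):
    """Convert [x, y, width, height] in pixels to fractions of the image size.

    Assumption: normalized values are pixel values divided by the image
    width (for x and width) or image height (for y and height).
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x, y, width, height = bbox
    return [
        x / image_width,
        y / image_height,
        width / image_width,
        height / image_height,
    ]


normalized = normalize_bbox([320, 180, 64, 36], image_width=1280, image_height=720)
print(normalized)
```

If values like these are written to the annotation JSON, is_normalized would be set to True so that the conversion back to pixels happens on the server side.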

Returns

The ID of the newly created collection and the corresponding job ID will be returned. The unique name of the file will be returned only if you upload a single file.

{
'is_success': True/False,
'job_id': '<Job Id of the operation>',
'collection_id': '<Id of the uploading collection>',
'unique_name': '<unique name of the file>'
}
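A small helper can make the handling of this result explicit. The sketch below works on a plain dictionary shaped like the one above; the helper name `read_upload_result` and the choice to raise on failure are assumptions, and the collection ID in the sample is a placeholder.

```python
def read_upload_result(upload_res):
    """Return (job_id, collection_id, unique_name) or raise if the upload failed.

    unique_name is None when the key is missing, which is expected here
    when a directory (rather than a single file) was uploaded.
    """
    if not upload_res.get("is_success"):
        raise RuntimeError(f"Upload was not accepted: {upload_res!r}")
    return (
        upload_res["job_id"],
        upload_res["collection_id"],
        upload_res.get("unique_name"),
    )


# Sample dictionary standing in for the value returned by the SDK call.
sample = {
    "is_success": True,
    "job_id": "654c5167a467d18f9fdb767c",
    "collection_id": "<collection_id>",
}
job_id, collection_id, unique_name = read_upload_result(sample)
print(job_id, collection_id, unique_name)
```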

JSON format of metadata for individual files

{
"files" : [
{
"file": "<file_name1>",
"metadata": {
"field1": "data1",
"field2": "data2",
"Tags": [
"<tag1>", "<tag2>"
]
}
},
{
"file": "<file_name2>",
"metadata": {
"field1": "data3",
"field3": "data4",
"Tags": [
"<tag2>","<tag3>"
]
}
}
]
}
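The per-file metadata file can be generated rather than written by hand. The following sketch builds the structure shown above from a Python dictionary and writes it to disk; the helper name `build_file_metadata`, the input layout, and the output location are assumptions for illustration only.

```python
import json
import tempfile
from pathlib import Path


def build_file_metadata(per_file):
    """Build the {"files": [...]} structure.

    per_file maps a file name to a (fields, tags) pair, where fields is a
    dict of metadata fields and tags is a list of tag strings.
    """
    files = []
    for file_name, (fields, tags) in per_file.items():
        metadata = dict(fields)
        if tags:
            metadata["Tags"] = list(tags)
        files.append({"file": file_name, "metadata": metadata})
    return {"files": files}


payload = build_file_metadata(
    {
        "dog_1.jpeg": ({"field1": "data1", "field2": "data2"}, ["retail", "night"]),
        "dog_2.jpeg": ({"field1": "data3"}, []),
    }
)

# Write to a temporary location; pass this path as file_meta_data_json_path.
metadata_json_path = Path(tempfile.mkdtemp()) / "metadata.json"
metadata_json_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
print(metadata_json_path)
```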

Example usage:

1. Upload with metadata for whole collection
meta_data_object = {
"Captured Location": "Winnipeg",
"Camera Id": "CAM_0001",
"Tags": [
"#retail"
]
}
upload_res = client.upload_files_to_collection("/home/user/images", "image", "my_collection", meta_data_object)
upload_job_id = upload_res['job_id']

#Waiting for upload processing to complete
client.wait_for_job_complete(upload_job_id)
print('Upload Completed!')
2. Upload with metadata specific to each file
meta_data_object = {
"Captured Location": "Toronto",
"Camera Id": "CAM_0002",
"Tags": [
"#retail"
]
}
metadata_json_path = '/home/user/path/to/json/metadata.json'
upload_res = client.upload_files_to_collection("/home/user/images", "image", "my_collection1", meta_data_object, False, metadata_json_path)

JSON format for annotation data with 'rectangle' shape type


{
"images": [
{
"image": "image_file_name.jpg",
"annotations": [
{
"type": "rectangle",
"bbox": [
<top1_left_x(number)>,
<top1_left_y(number)>,
<width1(number)>,
<height1(number)>
],
"confidence": 0.53,
"label": "<label_name>",
"metadata": {
"<optional_meta_field1>": "<metadata_value1>",
"<optional_meta_field2>": "<metadata_value2>"
},
"attributes": {
"<optional_attribute_name1>": [
{
"value": "<attribute_value1>",
"confidence": 0.35,
"metadata": {
"<optional_attribute_value1_metadata_field1>": "<attribute_metadata_value3>",
"<optional_attribute_value1_metadata_field2>": "<attribute_metadata_value4>"
}
},
{
"value": "attribute_value2",
"confidence": 0.33,
"metadata": {
}
}
],
"<optional_attribute_name2>": [
{
"value": "<attribute_value3>",
"confidence": 0.23,
"metadata": {

}
}
]
}
}
]
}
]
}
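Many detectors emit boxes as two corners rather than as top-left corner plus width and height. As a sketch, the helpers below convert corner coordinates into the bbox layout shown above and assemble a document with a single annotated image. The function names are hypothetical and the optional "attributes" block is omitted for brevity.

```python
import json


def corners_to_bbox(x_min, y_min, x_max, y_max):
    """Convert corner coordinates to [top_left_x, top_left_y, width, height]."""
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def rectangle_annotation(label, corners, confidence, metadata=None):
    """Build one annotation entry of type 'rectangle'."""
    return {
        "type": "rectangle",
        "bbox": corners_to_bbox(*corners),
        "confidence": confidence,
        "label": label,
        "metadata": metadata or {},
    }


def annotation_document(per_image):
    """per_image maps an image file name to its list of annotation entries."""
    return {
        "images": [
            {"image": image_name, "annotations": annotations}
            for image_name, annotations in per_image.items()
        ]
    }


document = annotation_document(
    {"image_file_name.jpg": [rectangle_annotation("car", (10, 20, 110, 70), 0.53)]}
)
print(json.dumps(document, indent=2))
```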

JSON format for annotation data with 'polygon' shape type

{
"images":[
{
"image":"image_file_name.jpg",
"annotations":[
{
"type": "polygon",
"polygon":[
[
<point1_x(number)>,
<point1_y(number)>
],
[
<point2_x(number)>,
<point2_y(number)>
],
[
<point3_x(number)>,
<point3_y(number)>
]
],
"label":"<label_name>",
"confidence": 0.53,
"metadata":{
"<meta_field_name1>":"<metadata_value1>"
}
}
]
}
]
}

JSON format for annotation data with 'line' shape type

{
"images":[
{
"image":"image_file_name.jpg",
"annotations":[
{
"type": "line",
"line":[
[
<point1_x(number)>,
<point1_y(number)>
],
[
<point2_x(number)>,
<point2_y(number)>
],
[
<point3_x(number)>,
<point3_y(number)>
]
],
"label":"<label_name>",
"confidence": 0.53,
"metadata":{
"<meta_field_name1>":"<metadata_value1>"
}
}
]
}
]
}
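Polygon and line annotations share the same layout: the key holding the point list has the same name as the shape type. A minimal sketch of a builder for both follows. The function name is hypothetical, and the minimum point counts (three for a polygon, two for a line) are geometric assumptions rather than documented limits.

```python
# Assumed minimum number of points per shape (not stated in the documentation).
MIN_POINTS = {"polygon": 3, "line": 2}


def shape_annotation(shape_type, points, label, confidence, metadata=None):
    """Build a 'polygon' or 'line' annotation entry from (x, y) points."""
    if shape_type not in MIN_POINTS:
        raise ValueError(f"unsupported shape type: {shape_type}")
    if len(points) < MIN_POINTS[shape_type]:
        raise ValueError(f"{shape_type} needs at least {MIN_POINTS[shape_type]} points")
    return {
        "type": shape_type,
        # The point list is stored under a key named after the shape type.
        shape_type: [[x, y] for x, y in points],
        "label": label,
        "confidence": confidence,
        "metadata": metadata or {},
    }


polygon = shape_annotation("polygon", [(0, 0), (40, 0), (20, 30)], "sign", 0.53)
line = shape_annotation("line", [(5, 5), (50, 60)], "lane", 0.53)
print(polygon)
print(line)
```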

3. Upload with annotation data to each file
meta_data_object = {
"Captured Location": "Toronto",
"Camera Id": "CAM_0002",
"Tags": [
"#retail"
]
}
metadata_json_path = '/home/user/path/to/json/metadata.json'
json_data_file_path = '/home/user/path/to/json/annotation.json'

upload_res = client.upload_files_to_collection(
"/home/user/images",
"image",
"my_collection1",
meta_data_object,
False,
metadata_json_path,
None,
{
"json_data_file_path": json_data_file_path,
"operation_unique_id": "car_human_annotations",
"is_normalized": False,
"is_model_run": False
}
)

2.2. Upload Model Predictions / Annotations to a Collection in DataLake

You can feed the Data Lake with a JSON file containing model run output (machine annotations) or ground truth (human annotations) for frames in a given image collection.

upload_annotations_for_collection(collection_name, operation_unique_id, json_data_file_path, is_normalized, is_model_run)

Note that the 'image' field of each entry in the JSON file must be set to the correct file name. The format of the JSON file depends on the shape type of the annotations.

Limitations

This function only supports images uploaded directly to the given collection after the initial extraction of data from storage. Images added to the collection using the 'Add to collection' option are not supported.

JSON format for 'rectangle'


{
"images": [
{
"image": "image_file_name.jpg",
"annotations": [
{
"type": "rectangle",
"bbox": [
<top1_left_x(number)>,
<top1_left_y(number)>,
<width1(number)>,
<height1(number)>
],
"confidence": 0.53,
"label": "<label_name>",
"metadata": {
"<optional_meta_field1>": "<metadata_value1>",
"<optional_meta_field2>": "<metadata_value2>"
},
"attributes": {
"<optional_attribute_name1>": [
{
"value": "<attribute_value1>",
"confidence": 0.35,
"metadata": {
"<optional_attribute_value1_metadata_field1>": "<attribute_metadata_value3>",
"<optional_attribute_value1_metadata_field2>": "<attribute_metadata_value4>"
}
},
{
"value": "attribute_value2",
"confidence": 0.33,
"metadata": {
}
}
],
"<optional_attribute_name2>": [
{
"value": "<attribute_value3>",
"confidence": 0.23,
"metadata": {

}
}
]
}
}
]
}
]
}

JSON format for 'polygon'

{
"images":[
{
"image":"image_file_name.jpg",
"annotations":[
{
"type": "polygon",
"polygon":[
[
<point1_x(number)>,
<point1_y(number)>
],
[
<point2_x(number)>,
<point2_y(number)>
],
[
<point3_x(number)>,
<point3_y(number)>
]
],
"label":"<label_name>",
"confidence": 0.53,
"metadata":{
"<meta_field_name1>":"<metadata_value1>"
}
}
]
}
]
}

JSON format for 'line'

{
"images":[
{
"image":"image_file_name.jpg",
"annotations":[
{
"type": "line",
"line":[
[
<point1_x(number)>,
<point1_y(number)>
],
[
<point2_x(number)>,
<point2_y(number)>
],
[
<point3_x(number)>,
<point3_y(number)>
]
],
"label":"<label_name>",
"confidence": 0.53,
"metadata":{
"<meta_field_name1>":"<metadata_value1>"
}
}
]
}
]
}

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| collection_name | string | - | Name of the existing image collection |
| operation_unique_id | string | - | The ID of the relevant model run or annotation project. This unique identifier distinguishes between different sets of annotations, for both human and machine annotations, so that annotations from different sources are not mixed up. If the same ID is used for multiple API calls, the previous annotations are replaced by the new ones. If a different ID is used, the new annotations are added to the MetaLake. |
| json_data_file_path | string | - | Absolute path of the JSON file containing the annotation data |
| is_normalized | boolean | - | True if normalized values for coordinates and dimensions are provided instead of real pixel values in the image. If this is True, conversion will happen at the MetaLake backend. |
| is_model_run | boolean | - | True for machine annotations, False for human annotations |

Example usage

client.upload_annotations_for_collection('my_collection', 'yolov5.0.1', '/my/file/path/file.json', False, True)

2.3. Upload Model Predictions / Annotations to Images in a Storage Path

This function uploads annotation data in the same manner as 'upload_annotations_for_collection' but targets images at a specified path within storage (e.g., a folder path inside an AWS S3 bucket). It is particularly useful for handling files retrieved during initial system crawling or data import from storage.

upload_annotations_by_storage_path(operation_unique_id, json_data_file_path, is_normalized, is_model_run, bucket_name)

Note that the 'image' field of each entry in the JSON file must contain the correct path of the image in storage. E.g., if the image is in the folder /folder/subfolder, then 'image' should be '/folder/subfolder/image_name.jpg'.
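If an annotation file was produced with bare file names, the 'image' fields can be rewritten to storage paths before upload. The sketch below shows one way to do that with the standard library; `storage_image_path` and `with_storage_paths` are hypothetical helper names, and the leading slash follows the example in the note above.

```python
import copy
import posixpath


def storage_image_path(folder, file_name):
    """Join a storage folder and a file name into '/folder/.../file_name'."""
    return posixpath.join("/", folder.strip("/"), file_name)


def with_storage_paths(document, folder):
    """Return a copy of an annotation document with 'image' set to storage paths."""
    updated = copy.deepcopy(document)
    for entry in updated["images"]:
        entry["image"] = storage_image_path(folder, entry["image"])
    return updated


original = {"images": [{"image": "image_name.jpg", "annotations": []}]}
converted = with_storage_paths(original, "/folder/subfolder")
print(converted["images"][0]["image"])
```

Working on a deep copy keeps the original document usable for functions that expect plain file names.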

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| operation_unique_id | string | - | The ID of the relevant model run or annotation project |
| json_data_file_path | string | - | Absolute path of the JSON file containing the annotation data |
| is_normalized | boolean | - | True if normalized values for coordinates and dimensions are provided instead of real pixel values in the image. If this is True, conversion will happen at the Data Lake backend. |
| is_model_run | boolean | - | True for machine annotations, False for human annotations |
| bucket_name | string | None | The name of the bucket in which the images are located. If this is not given, the default bucket is assumed. |

Example usage

client.upload_annotations_by_storage_path('yolov5.0.1', '/my/file/path/file.json', False, True, 'img_bucket_2')

JSON format example

{
"images": [
{
"image": "/path/in/bucket/image_file_name.jpg",
"annotations": [
{
"type": "rectangle",
"bbox": [
358,
239,
45,
16
],
"confidence": 0.53,
"label": "<label_name>",
"metadata": {

}
}
]
}
]
}

2.4. Upload Model Predictions / Annotations to Images by Unique Name

This function uploads annotation data in the same way as 'upload_annotations_for_collection', but the images are referenced by the 'Unique Name', a metadata attribute generated by the DataLake. It is specifically useful for handling files contained in a virtual collection, where the 'upload_annotations_for_collection' function is not applicable.

upload_annotations_by_unique_name(operation_unique_id, json_data_file_path, is_normalized, is_model_run)

Note that when uploading a JSON file, the 'Unique Name' of the relevant image should be specified in the 'image' field. If you download the files from the DataLake, the file name will be set to match the 'Unique Name'.

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| operation_unique_id | string | - | The ID of the relevant model run or annotation project |
| json_data_file_path | string | - | Absolute path of the JSON file containing the annotation data |
| is_normalized | boolean | - | True if normalized values for coordinates and dimensions are provided instead of real pixel values in the image. If this is True, conversion will happen at the Data Lake backend. |
| is_model_run | boolean | - | True for machine annotations, False for human annotations |

Example usage

client.upload_annotations_by_unique_name('yolov5.0.1', '/my/file/path/file.json', False, True)

JSON format example

{
"images": [
{
"image": "collection-name_image.jpg",
"annotations": [
{
"type": "rectangle",
"bbox": [
330,
102,
20,
32
],
"confidence": 0.53,
"label": "<label_name>",
"metadata": {

}
}
]
}
]
}

2.5. Upload Model Predictions / Annotations to Images by file upload job id

You can feed the Data Lake with a JSON file containing model run output (machine annotations) or ground truth (human annotations) for frames associated with a specific file upload job.

upload_annotations_by_job_id(job_id, operation_unique_id, json_data_file_path, is_normalized, is_model_run)

Note that the 'image' field of each entry in the JSON file must be set to the correct file name. The format of the JSON file depends on the shape type of the annotations.

JSON format for 'rectangle'


{
"images": [
{
"image": "image_file_name.jpg",
"annotations": [
{
"type": "rectangle",
"bbox": [
<top1_left_x(number)>,
<top1_left_y(number)>,
<width1(number)>,
<height1(number)>
],
"confidence": 0.53,
"label": "<label_name>",
"metadata": {
"<optional_meta_field1>": "<metadata_value1>",
"<optional_meta_field2>": "<metadata_value2>"
},
"attributes": {
"<optional_attribute_name1>": [
{
"value": "<attribute_value1>",
"confidence": 0.35,
"metadata": {
"<optional_attribute_value1_metadata_field1>": "<attribute_metadata_value3>",
"<optional_attribute_value1_metadata_field2>": "<attribute_metadata_value4>"
}
},
{
"value": "attribute_value2",
"confidence": 0.33,
"metadata": {
}
}
],
"<optional_attribute_name2>": [
{
"value": "<attribute_value3>",
"confidence": 0.23,
"metadata": {

}
}
]
}
}
]
}
]
}

JSON format for 'polygon'

{
"images":[
{
"image":"image_file_name.jpg",
"annotations":[
{
"type": "polygon",
"polygon":[
[
<point1_x(number)>,
<point1_y(number)>
],
[
<point2_x(number)>,
<point2_y(number)>
],
[
<point3_x(number)>,
<point3_y(number)>
]
],
"label":"<label_name>",
"confidence": 0.53,
"metadata":{
"<meta_field_name1>":"<metadata_value1>"
}
}
]
}
]
}

JSON format for 'line'

{
"images":[
{
"image":"image_file_name.jpg",
"annotations":[
{
"type": "line",
"line":[
[
<point1_x(number)>,
<point1_y(number)>
],
[
<point2_x(number)>,
<point2_y(number)>
],
[
<point3_x(number)>,
<point3_y(number)>
]
],
"label":"<label_name>",
"confidence": 0.53,
"metadata":{
"<meta_field_name1>":"<metadata_value1>"
}
}
]
}
]
}

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| job_id | string | - | The ID of the MetaLake job associated with the file upload |
| operation_unique_id | string | - | The ID of the relevant model run or annotation project. This unique identifier distinguishes between different sets of annotations, for both human and machine annotations, so that annotations from different sources are not mixed up. If the same ID is used for multiple API calls, the previous annotations are replaced by the new ones. If a different ID is used, the new annotations are added to the MetaLake. |
| json_data_file_path | string | - | Absolute path of the JSON file containing the annotation data |
| is_normalized | boolean | - | True if normalized values for coordinates and dimensions are provided instead of real pixel values in the image. If this is True, conversion will happen at the MetaLake backend. |
| is_model_run | boolean | - | True for machine annotations, False for human annotations |

Example usage

client.upload_annotations_by_job_id('654c5167a467d18f9fdb767c', 'yolov5.0.1', '/my/file/path/file.json', False, True)

2.6. Upload Model Predictions / Annotations -- Deprecated

You can feed the Data Lake with a JSON file containing model run output (machine annotations) or ground truth (human annotations) for frames in a given image collection.

upload_annoations_for_folder(collection_name, operation_unique_id, json_data_file_path, shape_type, is_normalized, is_model_run, destination_project_id)

Note that the 'image' field of each entry in the JSON file must be set to the correct file name. The format of the JSON file depends on the shape type of the annotations.

⚠️ Deprecation Warning

The function upload_annoations_for_folder is deprecated and will be removed in a future version.

We recommend transitioning to one of the following functions based on your needs:

  1. upload_annotations_for_collection
  2. upload_annotations_by_storage_path
  3. upload_annotations_by_unique_name
  4. upload_annotations_by_job_id

Please update your existing code to these new functions to avoid potential issues.

JSON format for 'rectangle'

{
"images":[
{
"image":"<image_filename>",
"annotations":[
{
"bbox":[
<top1_left_x(number)>,
<top1_left_y(number)>,
<width1(number)>,
<height1(number)>
],
"label":"<label_name>",
"metadata":{
"<attribute_name1>":"<attribute_value1>",
"<attribute_name2>":"<attribute_value2>"
},
"confidence":<confidence_value(number)>
}
]
}
]
}

JSON format for 'polygon' and 'line'

{
"images":[
{
"image":"000000397133.jpg",
"annotations":[
{
"polygon":[
[
<point1_x(number)>,
<point1_y(number)>
],
[
<point2_x(number)>,
<point2_y(number)>
],
[
<point3_x(number)>,
<point3_y(number)>
]
],
"label":"<label1>",
"metadata":{
"<attribute1_name>":"<attribute1_value>"
},
"confidence":<confidence_value(number)>
}
]
}
]
}

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| collection_name | string | - | Name of the existing image collection |
| operation_unique_id | string | - | A unique identifier that distinguishes between different sets of annotations, for both human and machine annotations, so that annotations from different sources are not mixed up. If the same ID is used for multiple API calls, the previous annotations are replaced by the new ones. If a different ID is used, the new annotations are added to the Data Lake. |
| json_data_file_path | string | - | Absolute path of the JSON file containing the annotation data |
| shape_type | string | - | Type of the annotations: rectangle, polygon, or line |
| is_normalized | boolean | - | True if normalized values for coordinates and dimensions are provided instead of real pixel values in the image. If this is True, conversion will happen at the Data Lake backend. |
| is_model_run | boolean | - | True for machine annotations, False for human annotations |
| destination_project_id (optional) | string | None | Only applicable when "is_model_run" is True. If this is given, the annotations are attached to the given studio project. This can be used for attaching auto annotations to studio projects. |

Example usage

1. For a model run
client.upload_annoations_for_folder('my_collection', 'yolov5.0.1', '/my/file/path/file.json', 'polygon', False, True)
2. For a human annotation (Upload annotations to an existing project)
client.upload_annoations_for_folder('my_collection', '<annotation_project_id>', '/my/file/path/file.json', 'polygon', False, False)

2.7. Download a Collection with Annotations

This function downloads annotation data for a list of human or machine annotation operations from a given image collection. It saves the annotations in JSON format (the same format used for uploading annotation data) along with the images.

download_collection(collection_id, annotation_type, operation_id_list, custom_download_path, is_media_include)

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| collection_id | string | - | The image collection ID in the Data Lake |
| annotation_type (Optional) | string | all | Type of annotation to download. Available values are 'human', 'machine' or 'all'. Note that this is applicable only when operation_id_list is not given or empty. |
| operation_id_list (Optional) | list of strings | [] | List of required annotation operation IDs. This can be a project ID in the case of human annotations or a model ID in the case of machine annotations. |
| custom_download_path (Optional) | string | empty | If this is given, the images are downloaded to this location; otherwise they are downloaded to a directory within the current directory. Note that this requires the absolute path. |
| is_media_include (Optional) | boolean | True | If True, the system downloads both the annotation data and the associated media files. If False, only the annotation data is downloaded and the media files are skipped. |

Returns

The function creates a new directory with a specific name and saves a JSON file inside it containing annotations for the collection for all required annotation operations. Next, it downloads specific frames related to the collection and saves them in a directory called "data" within the newly created directory.
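After a download, the directory layout described above can be read back with the standard library. The sketch below assumes only what the description states: a JSON file at the top level of the new directory and media files in a "data" subdirectory. The JSON file's actual name is not documented here, so the helper (a hypothetical `load_downloaded_collection`) simply picks up every *.json file; the demo directory and its file names are made up.

```python
import json
import tempfile
from pathlib import Path


def load_downloaded_collection(download_dir):
    """Return (annotations_by_file, media_file_names) for a download directory."""
    root = Path(download_dir)
    annotations = {
        path.name: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(root.glob("*.json"))
    }
    data_dir = root / "data"
    media = sorted(p.name for p in data_dir.iterdir()) if data_dir.is_dir() else []
    return annotations, media


# Throwaway directory mimicking the described layout (file names are made up).
root = Path(tempfile.mkdtemp())
(root / "data").mkdir()
(root / "data" / "image_file_name.jpg").write_bytes(b"")
(root / "annotations.json").write_text(json.dumps({"images": []}), encoding="utf-8")

annotations, media = load_downloaded_collection(root)
print(list(annotations), media)
```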

Example usage

client.download_collection("<collection_id>", "human", ["project1_id", "project2_id"], "/my/custom/path")

2.8. Download Annotations of a Collection -- Deprecated

You can download annotation data from a given image collection. The annotations are saved in JSON format (the same format used for uploading annotation data) along with the images. You need to supply the collection ID, which can be viewed in the collection's metadata in the Data Lake frontend.

download_annotations(collection_id, model_id)

⚠️ Deprecation Warning

This function, download_annotations(collection_id, model_id), is deprecated and will be removed in a future version. Please use the download_collection function instead for future developments and consider updating existing code to avoid issues when this function is eventually removed.

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| collection_id | string | - | The image collection ID in the Data Lake |
| model_id | string/None | - | If this is present, the system fetches the annotations belonging to that model run; otherwise (if None) the ground truth data is fetched instead. |

Returns

The function creates a new directory with a specific name and saves a JSON file inside it containing annotations for a collection of data. Next, it downloads specific frames related to the collection and saves them in a directory called "data" within the newly created directory.

Example usage

client.download_annotations("63579fa0f7eb5e0e62d4705", None)

2.9. Get Downloadable Url for a File

This function retrieves the URL of any file within the Data Lake, enabling its download.

get_downloadable_url(file_key)

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| file_key | string | - | Unique name of the file in the MetaLake |

Returns

A signed URL is provided, enabling direct downloading of the corresponding file from storage. Please be aware that this URL has a limited lifespan, and it is crucial that your application does not reuse it.
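Because the URL expires, a reasonable pattern is to request it immediately before use and stream the file to disk in one step instead of storing the URL. The sketch below uses only the standard library. `download_once` is a hypothetical helper, and it assumes the SDK call yields the URL as a plain string, which is not confirmed here. A local file:// URL stands in for the signed URL so the example runs offline.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download_once(url, destination, timeout=30):
    """Stream the content behind url to destination and return the path."""
    destination = Path(destination)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(destination, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
    return destination


# Stand-in for a freshly requested signed URL (assumption: a plain string).
workdir = Path(tempfile.mkdtemp())
source = workdir / "source.bin"
source.write_bytes(b"example-bytes")

saved = download_once(source.as_uri(), workdir / "copy.bin")
print(saved)
```

In real use, the URL would come from a new get_downloadable_url call each time a download is needed.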

Example usage

client.get_downloadable_url("my_file_unique_name_1.jpeg")

2.10. Trash Items from a Collection

With this SDK function, all or a subset of items in a given collection can be moved to the trash.

trash_objects_from_collection(collection_id, query, filter)

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| collection_id | string | - | Collection ID |
| query (Optional) | string | - | The search query that filters the items in the collection. This is the same query format used in the Data Lake frontend. |
| filter (Optional) | object | - | Additional criteria, such as annotation type and uploaded date range, can be specified in the filter object as shown here: { "annotation_types": [<comma-separated list of types out of "raw", "human" and "machine">], "from_date": "<start date string>", "to_date": "<end date string>" } |
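The filter object is an ordinary dictionary, so it can be assembled and checked before the call. The sketch below validates the annotation types against the three documented values. The helper name `build_filter` is hypothetical, omitting unused keys is an assumption, and the date strings are passed through untouched because their expected format is not stated here.

```python
ALLOWED_ANNOTATION_TYPES = ("raw", "human", "machine")


def build_filter(annotation_types=None, from_date=None, to_date=None):
    """Build a filter dictionary, leaving out any criteria that were not given."""
    result = {}
    if annotation_types:
        unknown = [t for t in annotation_types if t not in ALLOWED_ANNOTATION_TYPES]
        if unknown:
            raise ValueError(f"unknown annotation types: {unknown}")
        result["annotation_types"] = list(annotation_types)
    if from_date:
        result["from_date"] = from_date
    if to_date:
        result["to_date"] = to_date
    return result


# Placeholder date strings; substitute the format your deployment expects.
trash_filter = build_filter(
    ["human", "machine"], "<start date string>", "<end date string>"
)
print(trash_filter)
```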

Returns

{
'message': '[success_count] objects successfully trashed, [failed_count] objects failed to trash',
'isSuccess': 'True or False'
}

2.11. Trash Items from DataLake

Items in the DataLake can be trashed without specifying a collection by selecting them with a query string and filters.

trash_objects_from_datalake(datalake_query, datalake_filter, content_type)

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| datalake_query (Optional) | string | - | The search query that filters items in the collection. This is the same query format used in the Data Lake frontend. |
| datalake_filter (Optional) | object | - | Additional criteria, such as annotation type and uploaded date range, can be specified as shown here: { "annotation_types": [<comma-separated list of types out of "raw", "human" and "machine">], "from_date": "<start date string>", "to_date": "<end date string>" } |
| content_type | string | "image" | Type of items to be trashed: "image" or "video" |

Returns

{
'message': '[success_count] objects successfully trashed, [failed_count] objects failed to trash',
'isSuccess': 'True or False'
}
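If the counts in the message are needed programmatically, they can be extracted with a regular expression. This sketch assumes the placeholders [success_count] and [failed_count] are replaced by plain integers in the real message, which is an inference from the template above rather than a documented guarantee. `parse_trash_result` is a hypothetical helper.

```python
import re

# Assumed message shape: "<int> objects successfully trashed, <int> objects failed to trash"
TRASH_MESSAGE = re.compile(
    r"(\d+) objects successfully trashed, (\d+) objects failed to trash"
)


def parse_trash_result(result):
    """Return (success_count, failed_count), or None if the message does not match."""
    match = TRASH_MESSAGE.search(str(result.get("message", "")))
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = {
    "message": "12 objects successfully trashed, 3 objects failed to trash",
    "isSuccess": True,
}
counts = parse_trash_result(sample)
print(counts)
```

Returning None for an unexpected message keeps the caller from acting on counts that were never reported.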

2.13. Update metadata for a collection

This function can be used for updating metadata for a given collection. By default, the metadata will be applied for all the files under that collection too.

upload_metadata_for_collection(collection_name, content_type, metadata_obj, is_apply_to_all_files)

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| collection_name | string | - | Name of the collection in the MetaLake |
| content_type | string | - | Type of files in the collection: "image" for image files, "video" for video files and "other" for all other files |
| metadata_obj | dictionary | {} | Custom metadata field and value pairs to be applied to the collection |
| is_apply_to_all_files (Optional) | boolean | True | If this is False, the given metadata is applied only to the collection head |

Returns

{
'message': 'Error message if there is any',
'isSuccess': 'True or False'
}

Example usage

client.upload_metadata_for_collection("my_collection", "image", {
"Captured Location": "Winnipeg",
"Camera Id": "CAM_0001",
"Tags": [
"#retail"
]
})

2.14. Update metadata for files of a specific file upload

This function is used to update metadata for all or a set of files associated with a specific file upload job.

upload_metadata_by_job_id(job_id, json_data_file_path)

Note that the 'file' field of each entry in the JSON file must be set to the correct file name.

JSON format for metadata file

{
"files": [
{
"file": "dog_1.jpeg",
"metadata": {
"field1": "data1",
"field2": "data2",
"Tags": ["retail", "night", "skip"]
}
},
{
"file": "dog_2.jpeg",
"metadata": {
"field1": "data2",
"Tags": ["night"]
}
},
{
"file": "flower3.jpg",
"metadata": {
"fileSize": 30
}
},
{
"file": "Dog_3.jpeg",
"metadata": {
"field3": "data4",
"field1": "data3",
"Tags": ["morning", "testing", "skip"]
}
}
]
}

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| job_id | string | - | The ID of the MetaLake job associated with the file upload |
| file_meta_data_json_path | string | - | Path to the JSON file containing metadata for each file in the collection |

Returns

{
'message': 'Error message if there is any',
'isSuccess': 'True or False'
}

Example usage

client.upload_metadata_by_job_id("654c5167a467d18f9fdb767c", "/path/to/my/file_metadata.json")

2.15. Update metadata for files in a Storage Path

This function updates metadata by targeting images at a specified path within storage (e.g., a file path inside an AWS S3 bucket). It is particularly useful for handling files retrieved during initial system crawling or data import from storage.

upload_metadata_by_storage_path(json_data_file_path, bucket_name)

Note that the 'file' field of each entry in the JSON file must contain the correct path of the file in storage. E.g., if an image is in the folder /folder/subfolder, then 'file' should be 'folder/subfolder/image_name.jpg'.

JSON format for metadata file

{
"files": [
{
"file": "folder/subfolder/image_name_1.jpeg",
"metadata": {
"field1": "data1",
"field2": "data2",
"Tags": ["retail", "night", "skip"]
}
},
{
"file": "folder/subfolder/image_name_2.jpeg",
"metadata": {
"field3": "data4",
"field1": "data3",
"Tags": ["morning", "testing", "skip"]
}
}
]
}

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| file_meta_data_json_path | string | - | Path to the JSON file containing metadata for each file in the collection |
| bucket_name (optional) | string | None | The name of the bucket in which the images are located. If this is not given, the default bucket is assumed. |

Returns

{
'message': 'Error message if there is any',
'isSuccess': 'True or False'
}

Example usage

client.upload_metadata_by_storage_path("/path/to/my/file_metadata.json")

2.16. Update metadata for files by unique name

Uploads metadata for a file to the MetaLake using the file's 'Unique Name' as a reference. The 'Unique Name' is a unique identifier for each file, generated by the MetaLake.

upload_metadata_by_unique_name(json_data_file_path)

Note that when uploading a JSON file, the 'Unique Name' of the relevant image should be specified in the 'file' field. If you download the files from the DataLake, the file name will be set to match the 'Unique Name'.

JSON format for metadata file

{
"files": [
{
"file": "collection_image_name_1.jpeg",
"metadata": {
"field1": "data1",
"field2": "data2",
"Tags": ["retail", "night", "skip"]
}
},
{
"file": "collection_image_name_2.jpeg",
"metadata": {
"field3": "data4",
"field1": "data3",
"Tags": ["morning", "testing", "skip"]
}
}
]
}

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| file_meta_data_json_path | string | - | Path to the JSON file containing metadata for each file in the collection |

Returns

{
'message': 'Error message if there is any',
'isSuccess': 'True or False'
}

Example usage

client.upload_metadata_by_unique_name("/path/to/my/file_metadata.json")

2.17. Update metadata for given set of individual files in a collection

This function is used to update metadata for all or a set of files in a given collection.

upload_metadata_for_files(collection_name, content_type, file_meta_data_json_path)

Note that the 'file' field of each entry in the JSON file must be set to the correct file name.

Limitations

This function is designed to update metadata exclusively for files uploaded for a collection after the initial extraction of data from storage. Note that it is only compatible with files directly uploaded to the given collection. If files are added to the collection using the 'Add to collection' option, this function will not support them.

JSON format for metadata file

{
"files": [
{
"file": "dog_1.jpeg",
"metadata": {
"field1": "data1",
"field2": "data2",
"Tags": ["retail", "night", "skip"]
}
},
{
"file": "dog_2.jpeg",
"metadata": {
"field1": "data2",
"Tags": ["night"]
}
},
{
"file": "flower3.jpg",
"metadata": {
"fileSize": 30
}
},
{
"file": "Dog_3.jpeg",
"metadata": {
"field3": "data4",
"field1": "data3",
"Tags": ["morning", "testing", "skip"]
}
}
]
}

Parameters

| Parameter | Data type | Default | Description |
| --- | --- | --- | --- |
| collection_name | string | - | Name of the collection in the MetaLake |
| content_type | string | - | Type of files in the collection: "image" for image files, "video" for video files and "other" for all other files |
| file_meta_data_json_path | string | - | Path to the JSON file containing metadata for each file in the collection |

Returns

{
'message': 'Error message if there is any',
'isSuccess': 'True or False'
}

Example usage

client.upload_metadata_for_files("my_collection", "image",  "/path/to/my/file_metadata.json")

2.18. Upload Files -- Deprecated

You can upload a single file or files in a directory to the Data Lake with custom metadata. Only one type of content (either image or video) can be uploaded in a single API call.

file_upload(path, collection_type, collection_name, meta_data_object, override)

Parameters

Parameter | Data type | Default | Description
path | string | - | Directory or file path (should be an absolute path). The SDK automatically identifies whether it's a directory or a single file based on the path.
content_type | integer | - | 5 for image, 4 for video
collection_name | string | - | A name given for the collection. If an existing collection name is given, then files will be added to that collection.
meta_data_object | dictionary | - | Custom metadata field and value pairs
override | boolean | - | If the value is set to True, the new file will override the existing file with the same name. Otherwise, the upload process will skip files with the same name.

Example usage

meta_data_object = {
"Captured Location": "Winnipeg",
"Camera Id": "CAM_0001",
"Tags": [
"#retail"
]
}
client.file_upload("/home/user/images", 5, "my_collection", meta_data_object)

2.19. Download files from MetaLake

Download files from MetaLake based on specified criteria and pagination details.

client.download_files_from_metalake(
item_type,
custom_download_path,
page_index,
page_size,
query,
filter,
sort_order
)

Parameters

Parameter | Data type | Default | Description
item_type | string | 'image' | One of the following: "image", "video" or "other".
custom_download_path | string | - | If this is given, the files are downloaded to this location; otherwise they are downloaded to a directory within the current directory. Note that this requires the absolute path.
page_index (Optional) | integer | 0 | The index of the page, starting from 0.
page_size (Optional) | integer | 20 | The size of the page. The maximum allowed value is 1000.
query (Optional) | string | - | The search query that filters items in the MetaLake. This is the same query format that we use in the MetaLake frontend.
filter (Optional) | object | - | Additional criteria, such as annotation types and uploaded date range, specified in the following format: { "annotation_types": [<list of types out of "raw", "human" and "machine">], "from_date": "<start date string>", "to_date": "<end date string>" }
sort_order (Optional) | dictionary | { "sort_by_field": "date_modified", "sort_order": "DESC" } | Specifies the sorting order of the returned items. Format: { "sort_by_field": "date_modified"/"date_created"/"name"/"size"/"video_index", "sort_order": "ASC"/"DESC" }.

Returns

A dictionary with a boolean success flag and an optional message.

{
"is_success": True/False,
"message": <optional_message>
}

If there are no more pages to download:

{
"is_success": False,
"message": "No more pages to download"
}

Example Usage

client.download_files_from_metalake(
"image",
"/home/download_folder",
0,
20,
"annotation.label=Bird",
{
"annotation_types": ["human", "machine"],
"from_date": "2022-08-02",
"to_date": "2023-01-19",
},
{"sort_by_field": "date_created", "sort_order": "ASC"}
)
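
Because each call downloads a single page and an unsuccessful response signals that no pages remain, the call can be wrapped in a loop to fetch every page. The following is a minimal sketch, not an SDK feature: the helper download_all_pages and its max_pages safety cap are hypothetical, client is assumed to be an already initialized SDK client, and the optional query, filter and sort_order arguments are assumed to be safe to omit.

```python
def download_all_pages(client, item_type, download_path, page_size=20, max_pages=1000):
    """Call download_files_from_metalake page by page until it reports failure.

    Returns the number of pages that were downloaded successfully.
    max_pages is a safety cap added for this sketch, not an SDK limit.
    """
    pages_downloaded = 0
    for page_index in range(max_pages):
        response = client.download_files_from_metalake(
            item_type, download_path, page_index, page_size
        )
        if not response.get("is_success"):
            # An unsuccessful response covers both the "No more pages to
            # download" case and genuine errors, so report the message.
            print(f"Stopped at page {page_index}: {response.get('message')}")
            break
        pages_downloaded += 1
    return pages_downloaded
```

The loop stops on the first unsuccessful response, so the printed message should be checked to tell the end of the results apart from a real failure.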

2.20. Download files from a MetaLake collection

Download files from a MetaLake collection based on specified criteria and pagination details.

client.download_files_from_collection(
collection_id,
custom_download_path,
page_index,
page_size,
query,
filter,
sort_order
)

Parameters

Parameter | Data type | Default | Description
collection_id | string | - | The ID of the collection
custom_download_path | string | - | If this is given, the files are downloaded to this location; otherwise they are downloaded to a directory within the current directory. Note that this requires the absolute path.
page_index (Optional) | integer | 0 | The index of the page, starting from 0.
page_size (Optional) | integer | 20 | The size of the page. The maximum allowed value is 1000.
query (Optional) | string | - | The search query that filters items in the MetaLake. This is the same query format that we use in the MetaLake frontend.
filter (Optional) | object | - | Additional criteria, such as annotation types and uploaded date range, specified in the following format: { "annotation_types": [<list of types out of "raw", "human" and "machine">], "from_date": "<start date string>", "to_date": "<end date string>" }
sort_order (Optional) | dictionary | { "sort_by_field": "date_modified", "sort_order": "DESC" } | Specifies the sorting order of the returned items. Format: { "sort_by_field": "date_modified"/"date_created"/"name"/"size"/"video_index", "sort_order": "ASC"/"DESC" }.

Returns

A dictionary with a boolean success flag and an optional message.

{
"is_success": True/False,
"message": <optional_message>
}

If there are no more pages to download:

{
"is_success": False,
"message": "No more pages to download"
}

Example Usage

client.download_files_from_collection(
"65004ce4365f0510adb2f649",
"/home/download_folder",
0,
20,
"annotation.label=Bird",
{
"annotation_types": ["human", "machine"],
"from_date": "2022-08-02",
"to_date": "2023-01-19",
},
{"sort_by_field": "date_created", "sort_order": "ASC"}
)
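
The filter argument is an ordinary dictionary, so it can be assembled and checked before the download call is made. The following is a minimal sketch: the helper build_download_filter is hypothetical and not part of the SDK, the YYYY-MM-DD date format is inferred from the example above rather than confirmed, and it is assumed that keys that are not needed can simply be left out.

```python
from datetime import date

# Annotation types listed in the parameter table.
ALLOWED_ANNOTATION_TYPES = {"raw", "human", "machine"}


def build_download_filter(annotation_types=None, from_date=None, to_date=None):
    """Build the filter dictionary described in the parameter table.

    Dates are datetime.date objects and are rendered as YYYY-MM-DD, the
    format used in the example (assumed, not confirmed by the SDK).
    """
    result = {}
    if annotation_types:
        unknown = set(annotation_types) - ALLOWED_ANNOTATION_TYPES
        if unknown:
            raise ValueError(f"unknown annotation types: {sorted(unknown)}")
        result["annotation_types"] = list(annotation_types)
    if from_date and to_date and from_date > to_date:
        raise ValueError("from_date must not be later than to_date")
    if from_date:
        result["from_date"] = from_date.isoformat()
    if to_date:
        result["to_date"] = to_date.isoformat()
    return result
```

Calling build_download_filter(["human", "machine"], date(2022, 8, 2), date(2023, 1, 19)) produces the same filter dictionary as the one passed in the example above.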