Enhance content/asset search in AEM DAM/Azure Storage using FAISS vector database
Use Case:
Search for similar assets in AEM DAM/Azure Storage using the FAISS vector database. For example, search for images whose subject is classified as male.
This approach is particularly useful when metadata is unavailable for a set of assets and duplicate or similar assets need to be found.
Overview
- Fetch Images: Retrieve images from the repository.
- Filter Images: Use face detection and gender classification to filter the images.
- Extract Features: Convert the filtered images into feature vectors.
- Index and Search: Index these feature vectors using FAISS and find similar images.
Prerequisites for Local Setup: A running AEM instance with assets in DAM, Python with FAISS installed, and the other Python libraries the script imports.
Models Used
- ResNet50: A deep learning model utilized for extracting features from images. The final classification layer is removed to use only the feature extraction part.
Steps
1. Fetch Images from AEM DAM
The script retrieves images from the AEM DAM.
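How the script fetches the images is not shown here, so the sketch below is one possible approach and not necessarily the author's. It assumes the AEM QueryBuilder JSON endpoint (/bin/querybuilder.json) is reachable, that basic authentication is enabled, and that the instance serves the original binary at the asset path. The host, DAM path, and function names are placeholders.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_query_url(host, dam_path, limit=100):
    """Build a QueryBuilder URL listing dam:Asset nodes under dam_path."""
    params = {
        "path": dam_path,
        "type": "dam:Asset",
        "p.limit": str(limit),
    }
    return f"{host}/bin/querybuilder.json?{urllib.parse.urlencode(params)}"


def parse_asset_paths(response_text, extensions=(".png", ".jpg", ".jpeg")):
    """Extract image asset paths from a QueryBuilder JSON response."""
    hits = json.loads(response_text).get("hits", [])
    return [h["path"] for h in hits
            if h.get("path", "").lower().endswith(extensions)]


def http_get(url, user, password):
    """GET a URL with HTTP basic authentication and return the raw bytes."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def fetch_images(host, dam_path, user, password):
    """Yield (path, image_bytes) for every image asset under dam_path."""
    listing = http_get(build_query_url(host, dam_path), user, password)
    for path in parse_asset_paths(listing.decode("utf-8")):
        # Asset paths may contain spaces, so percent-encode them.
        yield path, http_get(host + urllib.parse.quote(path), user, password)
```

A call such as fetch_images("http://localhost:4502", "/content/dam/demo", user, password) would then stream the image bytes to the later steps; the extension filter should be adjusted to the asset types actually stored in DAM.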
2. Image Processing
- Face Detection: Use a Haar cascade classifier, a machine learning object detection method, to detect faces.
- Feature Extraction: Convert images into feature vectors using the ResNet50 model.
- Gender Classification: For demonstration purposes, the gender is hard-coded as 'male'; no gender model is actually run.
3. Indexing and Searching
- FAISS Index: Create an index of feature vectors using the FAISS library for similarity search.
- Querying: Query the index with the feature vector of the first image to find similar images.
4. Output Results
- The script displays the results of the similarity search, showing the nearest images and their distances from the query image.
Configuration in AEM
- Add MIME Types: In the OSGi configuration manager (/system/console/configMgr), add content/*:image/png,image/svg+xml to the org.apache.sling.security.impl.ContentDispositionFilter configuration.
- Update MIME Types: Adjust the expression above, and the MIME types in the script, to match the asset types in your DAM.
Execution
- Run AEM Locally.
- Upload Assets: Upload images that contain male faces so the search has matching assets to find.
- Update the Script: Configure the DAM path, MIME type, and credentials in the script.
- Set Environment Variable: In CMD, run set KMP_DUPLICATE_LIB_OK=TRUE before starting the script from the same window.
- Run the Script.
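The environment variable from the steps above can also be set inside the script instead of in CMD. This is a sketch of that alternative, not something the original script is known to do. KMP_DUPLICATE_LIB_OK tells the Intel OpenMP runtime to continue when it finds a second copy of itself already loaded, which can happen when FAISS and other numeric libraries each bundle their own copy. It is a workaround rather than a fix, so treat it as suitable for local experiments only.

```python
import os

# Must run before importing faiss, torch, or other libraries that load
# OpenMP, because the runtime reads the variable when it initializes.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
```

Placing these two lines at the very top of the script keeps the setting with the code, so the script behaves the same regardless of which terminal launches it.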
You can also create a Flask application to trigger this functionality via a button click or checkbox in AEM.
For details on deploying a Flask web app to Azure App Service, refer to the following guide:
Deploy a Flask Web App to Azure App Service
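A Flask trigger of the kind mentioned above could look roughly like this. It is a sketch under stated assumptions: the /search route, the damPath field, and run_similarity_search are hypothetical names, and the stub would need to be wired to the actual fetch, filter, index, and search code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def run_similarity_search(dam_path):
    """Hypothetical stand-in for the fetch, filter, index and search steps."""
    return []  # replace with a call into the real pipeline


@app.route("/search", methods=["POST"])
def search():
    # Expect a JSON body such as {"damPath": "/content/dam/demo"}.
    payload = request.get_json(silent=True) or {}
    dam_path = payload.get("damPath")
    if not dam_path:
        return jsonify(error="damPath is required"), 400
    return jsonify(damPath=dam_path, results=run_similarity_search(dam_path))
```

If this is saved as app.py, it can be started locally with flask --app app run, and a button or checkbox handler in AEM could then POST the DAM path to /search.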
Hope you find it useful. Thanks for reading.
Download script here:
https://experienceleaguecommunities.adobe.com/skucn57933/attachments/skucn57933/adobe-experience-manager-discussions/47279/1/aem-dam.zip