Delete item in SharePoint dataloader

Hi all,

We are adapting the SharePoint dataloader for a client and they want any items deleted in SharePoint to be correspondingly deleted in Squirro. The dataloader uses a delta API to retrieve new or edited items, including items that have been deleted. Right now the code skips over any files that are deleted:

for file_entry in _file_entries():
    is_file = "file" in file_entry
    deleted = "deleted" in file_entry
    if not is_file or deleted:
        continue

but we want to get the deleted item’s ID and delete the corresponding item from Squirro.

Are there any other dataloaders that implement deletion? We could implement deletion by creating a SquirroClient, looking up matching items, and then deleting them, but is there a more standard way to implement this?

Thanks,
Kate

You can handle deletions with a per-item “flag” field, as documented here:
https://squirro.atlassian.net/wiki/spaces/DOC/pages/30670944/Data%2BLoader%2BReference
At the very least, the filesystem_plugin implements it, if you are looking for an example of its use.

While I cannot guarantee that this patch works (I haven’t had a chance to test it yet), it should give you an idea of how it can be done:

diff --git a/sharepoint_plugin/dataloader_plugin.py b/sharepoint_plugin/dataloader_plugin.py
index 9d53c0d..faa91b1 100644
--- a/sharepoint_plugin/dataloader_plugin.py
+++ b/sharepoint_plugin/dataloader_plugin.py
@@ -203,7 +203,13 @@ class __OnedriveSourceBase(DataSource):
         for file_entry in _file_entries():
             is_file = "file" in file_entry
             deleted = "deleted" in file_entry
-            if not is_file or deleted:
+            if not is_file:
+                continue
+            if deleted:
+                batch.append({
+                    "id": file_entry["id"],
+                    "flag": "d",
+                })
                 continue
 
             transformed_entry = self._transform_file_entry(file_entry)
@@ -371,6 +377,15 @@ class __OnedriveSourceBase(DataSource):
                 "type": "str",
                 "advanced": True,
             },
+            {
+                "name": "deletions",
+                "display_label": "Detect file deletions",
+                "help": "If set, file deletions will be detected and the "
+                "corresponding files will be deleted from Squirro",
+                "type": "bool",
+                "action": "store_true",
+                "advanced": True,
+            },
         ]
 
     def getIncrementalColumns(self) -> None:
diff --git a/sharepoint_plugin/mappings.json b/sharepoint_plugin/mappings.json
index 30acb38..3f10b69 100644
--- a/sharepoint_plugin/mappings.json
+++ b/sharepoint_plugin/mappings.json
@@ -6,5 +6,6 @@
     "map_file_mime": "file.mimeType",
     "map_file_data": "content",
     "map_url": "webUrl",
+    "map_flag": "flag",
     "facets_file": "facets.json"
 }

Future versions of this plugin should support this out of the box, but I can’t give an ETA.
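For reference, here is the filtering logic from the patch as a standalone sketch, so it can be tried outside the plugin. The function name `iter_items`, the `detect_deletions` parameter and the `title` field are hypothetical stand-ins: the real plugin appends to `batch` and calls `_transform_file_entry`. Note that the hunk above adds a `deletions` option but does not consult it; gating on it, as done here, is an assumption about the intended behaviour rather than something the patch itself does.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_items(
    file_entries: Iterable[Dict[str, Any]],
    detect_deletions: bool = False,
) -> Iterator[Dict[str, Any]]:
    """Yield loader items for delta entries (hypothetical helper, not plugin API)."""
    for entry in file_entries:
        # Same order as the patch: non-file entries (e.g. folders) are skipped first.
        if "file" not in entry:
            continue
        if "deleted" in entry:
            # Emit a deletion record only when the option is enabled (assumed gating).
            if detect_deletions:
                yield {"id": entry["id"], "flag": "d"}
            continue
        # Placeholder for the plugin's _transform_file_entry().
        yield {"id": entry["id"], "title": entry.get("name", "")}


if __name__ == "__main__":
    sample = [
        {"id": "1", "file": {}, "name": "report.docx"},
        {"id": "2", "folder": {}},
        {"id": "3", "file": {}, "deleted": {"state": "deleted"}},
    ]
    for item in iter_items(sample, detect_deletions=True):
        print(item)
```

The deletion record only reaches Squirro as a delete if the `flag` field is mapped, which is what the `map_flag` line in mappings.json does.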

Thanks @maciej.urbanski - we’ve implemented this and it works.
