A few questions

I am at the point where I am testing my finalcrm pipelet and plugin. Everything works fine when testing in my terminal, but I am having a few issues validating everything in the UI. Firstly, when testing the ‘accounts’ endpoint I get this error at the first stage of the data loader. This was working fine before in the UI, so I think I might have to reset the count perhaps. How do I do this?

Secondly, my labels/keyword label is not showing up in the tags, so I can’t verify that my pipelet works in the UI. Where do I go to configure/show labels in tags?

Hi @filsan.hassan,
Sorry to hear you are having some issues.

In regards to your first question, I believe the fake CRM endpoint will return an error for any count > 50. To work around this, I would modify your finalcrm dataloader to request batches of 50 or fewer, then use start to indicate the offset you want to continue from.
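The paging pattern described above can be sketched roughly as follows. The `fetch_batch` function here is a hypothetical stand-in for the real HTTP call to the fake-CRM accounts endpoint (the parameter names `count` and `start` come from the discussion; everything else is illustrative):

```python
MAX_COUNT = 50  # the fake CRM endpoint errors for any count > 50


def fetch_batch(start, count):
    # Stand-in for the real HTTP request, e.g. something like
    # requests.get(url, params={"start": start, "count": count}).
    # Here we pretend the CRM holds 120 account rows.
    data = [f"account-{i}" for i in range(120)]
    return data[start:start + count]


def fetch_all(batch_size=MAX_COUNT):
    """Page through the endpoint in chunks of at most MAX_COUNT rows."""
    batch_size = min(batch_size, MAX_COUNT)
    start, results = 0, []
    while True:
        batch = fetch_batch(start, batch_size)
        if not batch:
            break  # empty batch means we've read everything
        results.extend(batch)
        start += len(batch)
    return results
```

Even if the caller asks for a larger batch size, each request stays at 50 or below and `start` advances by however many rows actually came back.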

I would also verify that you are checking whether the dataloader is in preview mode when getting data batches, and reduce the count to something small in that case as well:

def getDataBatch(self, batch_size):
    # your code etc...
    if self.preview_mode:  # keep preview batches small for a quick UI
        batch_size = 1

(If you are assigning the schema programmatically, I would also consider modifying your getSchema() method to only request with a count of 1. This way you can avoid long waits when configuring sources in the UI.)
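A minimal sketch of what that getSchema() change could look like. The class and its data are hypothetical; only the `getDataBatch`/`getSchema` method names come from the thread:

```python
class FinalCrmSource:
    """Minimal sketch of a dataloader source; the rows are made up."""

    def getDataBatch(self, batch_size):
        # Stand-in for the real paged CRM fetch: yields lists of dicts.
        rows = [{"account_id": i, "name": f"Account {i}"} for i in range(200)]
        for start in range(0, len(rows), batch_size):
            yield rows[start:start + batch_size]

    def getSchema(self):
        # Request a single row instead of a full batch so the source
        # configuration step in the UI does not wait on a large download.
        first_batch = next(self.getDataBatch(batch_size=1), [])
        return list(first_batch[0].keys()) if first_batch else []
```

The schema is derived from the keys of one sample row, so fetching more than one record only costs time without adding information.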

In order to change this, I think the best option is to modify your code and re-upload it under the same name.

In regards to the labels, the first place I would check is Setup → Data → Labels and verify your labels appear there. You can modify their visibility from that page.

If you do not see your labels (even after a page refresh), then an alternative option is to view the item format from your browser’s developer tools.

In Chrome, navigate to the Explore page and then open the dev tools (on Mac: right-click and select “Inspect”). Within the dev tools, go to the Network tab. Now, in the Squirro UI, select an item that is lower on the page of results; you should see some new activity in the Network tab. One of these requests will be the item as stored by Squirro. Select it within the developer tools and click on the “Preview” tab. Here you can explore the item’s keywords and other values to verify that the labels have been added.

Here’s an example from a test project of mine:

If you see your tags in the correct area of the item (item['keywords']['your_label'] = ['value1', ...]) but not in the “Labels” page, you can add them in the “Labels” page by clicking the plus button and adding a label of the correct type (int/str/geo) with a value identical to the one in your item’s keywords. Make sure that they match in both value and type.
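If you want to double-check the values and types programmatically before creating the label, a quick sanity check on the item dict works. The label names and values here (`account_id`, `score`) are purely illustrative:

```python
# Print each keyword's values and their Python types, so you can create
# the matching label type (int/str/geo) on the "Labels" page.
item = {"keywords": {"account_id": ["ACC-001", "ACC-002"], "score": [42]}}

for label, values in item["keywords"].items():
    value_types = sorted({type(v).__name__ for v in values})
    print(f"{label}: values={values} types={value_types}")
```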


Thank you for the reply. I’ve managed to solve the first error; as you said, it was a case of changing the default batch_size.

As for the second error, I used the inspect-element method and the keywords were still not showing. But I have managed to figure out that it is because this line of my code is not working when the pipelet is run in the UI with the same test_item: account_ids = item['keywords']['account_id']. This works fine when tested in the terminal. Any ideas why it is not reading this properly?

Hi @filsan.hassan,
I’m glad to hear you sorted out the batch_size issue.

Regarding the keywords, without looking at the code it’s hard to diagnose the issue. I would expect you to see errors in the plumber logs for the pipelet. These can be found on the server at /var/log/squirro/plumber/plumber.log or in the Squirro UI under Server → Log Files → plumber.log (documentation). I would first check the plumber log for any obvious errors.

Looking at the sample code you supplied, I would also double-check that the “account_id” key is being added to the item’s keywords in the finalcrm dataloader. If that is not being mapped correctly, I would expect the pipelet to not work as intended. You can check your mappings in the dataloader config steps, which are accessible by clicking “Edit” on the datasource in question.

Alternatively, you can use the built-in logging module to log the steps around where you are assigning keywords, to see how your pipelet behaves when deployed.
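A rough sketch of what that logging could look like around the failing lookup. In a real pipelet this would be the `consume` method on your pipelet class; here it is a standalone function, and the `account_id` keyword name comes from your earlier message:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)


def consume(item):
    # Log what the item actually contains before reading the keyword,
    # so the plumber log shows why a lookup might fail in the UI.
    keywords = item.get("keywords", {})
    log.info("Incoming keyword names: %s", sorted(keywords))

    # .get() with a default avoids a KeyError when the mapping is missing.
    account_ids = keywords.get("account_id", [])
    if not account_ids:
        log.warning("Item %s has no 'account_id' keyword", item.get("id"))
    return item
```

Using `.get()` instead of direct indexing also means a missing mapping shows up as a warning in the log rather than as a crashed pipelet step.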

If you think it’s easier, you can also try expanding your test items to be more realistic with respect to what you are ingesting from your dataloader.

Here’s a sample item if you want to use variations on it to test your code locally (some keyword values are elided as "..."):

  "id": "urDkMYVDWO6iL7hyVf_Ixg",
  "link": "https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2023-42359",
  "starred": false,
  "title": "CVE-2023-42359",
  "body": "<html><body><p>SQL injection vulnerability in Exam Form Submission in PHP with Source Code v.1.0 allows a remote attacker to escalate privileges via the val-username parameter in /index.php.</p></body></html>",
  "created_at": "2023-09-18T12:15:07",
  "keywords": {
    "feed_hostname": [
    "source_type": [
    "cve_source": [
    "nlp_tag__phrases": [
      "sql injection vulnerability",
      "val username parameter",
      "source code v.1.0",
      "remote attacker",
      "exam form submission",
      "cve-2023 - 42359",
    "publish": [
  "communities": [],
  "sources": [
      "id": "JiQP1WSeSqGQywf-NauCHg",
      "title": "CVE Feed",
      "photo": "/storage/datasource_pictures/24/8c/117748fec7d98b59/8-feed_plugin"
  "notes": [],
  "references": []

If you have any more questions, please feel free to reach out.

Best of luck!