Conversation
|
Huh, did you not need to change the snapshot download function? Interesting. In any case, our processing of the MIP list may be messed up. Look at it here: There are many duplicates in the list, sometimes with different statuses or dates. I am not sure what the meaning of that is and how to correctly handle that, or how we even handle it. For example look at this MIP entry: https://sec-certs.org/fips/mip/entry/BoringCrypto and this MIP snapshot: https://sec-certs.org/fips/mip/699a4159df79b3f2eae196ad It contains BoringCrypto 4 times. I think right now the best course of action is to just ensure we are downloading and storing all the data (and not losing anything). Then we can figure out how these actually correspond to products and certifications. |
No, neither the site nor the table with entries changed except for the new statuses.
Duh, how could I have missed it... You are right, it's the case in the newer snapshots as well. On the page, we are not handling it. When listing status changes, only one of the duplicate entries with the same name is selected to compute the days and show the status changes. As you said, we need to figure out the semantics of the duplicates to be able to handle them better. Do you think that entries with the same name are just duplicates, so we could just consider the one with the most recent status date? Or are they semantically different? Any idea how to figure this out? |
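To make the two options concrete, here is a minimal sketch that groups snapshot rows by module name, so duplicates stay visible instead of being collapsed, and compares that with the "most recent status date wins" heuristic. The `Entry` type, its field names, and the sample rows are made up for illustration; they are not the actual sec-certs classes or real MIP data.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Entry:
    # Hypothetical stand-in for one row of a MIP snapshot (not the sec-certs type).
    module_name: str
    vendor_name: str
    status: str
    status_since: date


def group_by_name(entries):
    """Group rows by module name, keeping every duplicate, oldest status date first."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.module_name].append(entry)
    return {name: sorted(rows, key=lambda e: e.status_since) for name, rows in groups.items()}


def latest_per_name(entries):
    """The 'just take the most recent status date' heuristic, for comparison."""
    return {name: rows[-1] for name, rows in group_by_name(entries).items()}


# Illustrative rows only; statuses and dates are invented.
snapshot = [
    Entry("BoringCrypto", "Google, LLC", "Review Pending", date(2024, 1, 10)),
    Entry("BoringCrypto", "Google, LLC", "In Review", date(2024, 3, 5)),
    Entry("Other Module", "Some Vendor", "Coordination", date(2024, 2, 1)),
]
groups = group_by_name(snapshot)
dupes = {name: rows for name, rows in groups.items() if len(rows) > 1}
```

Inspecting `dupes` across several snapshots might show whether same-name rows move through statuses independently (semantically different submissions) or always in lockstep (plain duplicates).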
|
@J08nY I discovered that on the NIST MIP page each entry in the table has what may be some sort of ID in the vendor column. It seems to be tied to the entry itself, since it differs across entries from the same vendor and it appears stable (at least from a quick look at the Wayback Machine). Though there is one repeated value We could extract these IDs to differentiate entries with the same name in the list. The only edge case would be if both entries had the same name and both had the They call it submissionID in the function parameter... |
|
Nice, that could work. Could you introduce that into the MIPEntry type and add a parsing function that extracts it as well? Let's make a new I mean, we will not have this for all the past entries, so that is already screwed, but at least going forward we store all the info we can. Perhaps we can then figure out some heuristic based on the "status_since" and the usual progression of statuses to map the MIP entries to some cohesive flow that ends in a module... |
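A rough sketch of what that could look like, under loud assumptions: the markup of the vendor cell is not shown in this thread, so the `showVendor(12345)` call in the sample cell, the regex, and the `MIPEntrySketch` type are all hypothetical and not the real NIST markup or the real `MIPEntry`. The only thing taken from the discussion is that the ID shows up as a function parameter and that older entries will not have it.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: the ID appears as the numeric argument of some JavaScript call
# inside the vendor cell. The actual function name and markup may differ.
_ID_PATTERN = re.compile(r"\w+\(\s*'?(\d+)'?\s*\)")


@dataclass(frozen=True)
class MIPEntrySketch:
    # Hypothetical shape; the real MIPEntry has its own fields.
    module_name: str
    vendor_name: str
    status: str
    submission_id: Optional[int] = None  # None for past snapshots without the ID


def parse_submission_id(vendor_cell_html: str) -> Optional[int]:
    """Return the numeric ID found in the vendor cell, or None if there is none."""
    match = _ID_PATTERN.search(vendor_cell_html)
    return int(match.group(1)) if match else None


# Made-up cell content for illustration.
cell = '<a href="#" onclick="showVendor(12345)">Google, LLC</a>'
entry = MIPEntrySketch("BoringCrypto", "Google, LLC", "In Review", parse_submission_id(cell))
legacy = MIPEntrySketch("BoringCrypto", "Google, LLC", "In Review", parse_submission_id("Google, LLC"))
```

Keeping the field optional means old snapshots still deserialize, and anything that needs the ID can check for `None` explicitly.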
|
Yes, will do tmrw. And it probably also makes sense to handle it on the page, though we could wait until we have a few snapshots collected and see how reliable it actually is? |
I don't know how though. We will not have this ID for the vast majority of MIP entries. On the page you need to be able to show two pages, the Snapshot page (which is really a copy of NIST's MIP list from a given date, not providing much more than the Wayback Machine) and a page for each MIP entry. For the MIP entry page, I assumed the module name to be a unique key and used it in the URL as such. This is clearly insufficient and we need something better. The IDs will work on future data, but we would also need to change the URL addressing and some more stuff. Idk, I would not prioritize it now. |
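If this gets picked up later, one possible shape for the addressing is a key that prefers the ID and falls back to the module name for old data. This is only a sketch: the `id/` and `name/` prefixes and the `entry_key` helper are invented here and do not correspond to the site's current routes.

```python
from typing import Optional
from urllib.parse import quote


def entry_key(module_name: str, submission_id: Optional[int]) -> str:
    """Build a URL-safe key: the ID when known, else the percent-encoded name."""
    if submission_id is not None:
        return f"id/{submission_id}"
    # Fallback for entries from snapshots that predate the ID; not unique.
    return f"name/{quote(module_name, safe='')}"
```

The name-based fallback keeps the known ambiguity for historical entries, so it would still need a page that can list several entries under one name.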
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #565       +/-   ##
===========================================
- Coverage   69.93%   57.00%   -12.92%
===========================================
  Files          78       78
  Lines        9126     9137      +11
===========================================
- Hits         6381     5208    -1173
- Misses       2745     3929    +1184
☔ View full report in Codecov by Sentry. |
|
Please review this version and let me know if you agree with it for now. |