Inventory Device Matching in IT Visibility
IT Visibility’s normalization process uses certain identifying data fields from each device to recognize when the same device has been collected more than once. Deduplication is achieved not by selecting a primary source inventory record and removing any duplicates, but by merging the information from the various source inventory records into a single normalized device record for each unique device.
On this page, you can find the following information about device deduplication in IT Visibility:
• Which Identifying Data Fields Are Used for Device Deduplication
• What Matching Rules Are Used for Device Deduplication
• How the Properties of Duplicate Records are Merged
• How the Software Installed on Duplicate Devices is Handled
Which Identifying Data Fields Are Used for Device Deduplication
The following hardware data fields are used for deduplication in IT Visibility:
• Serial number
• Computer name
• Domain name
• BIOS UUID
The following additional data fields can also be used to decide which matching rules to use:
• Operating system
• Connection ID
What Matching Rules Are Used for Device Deduplication
Each matching rule involves one or more data fields. A rule is also subject to additional conditions based on the value of the relevant fields. If two devices being compared have the same value in each relevant field, and all the additional conditions are also met, then the records for these devices are considered duplicates to be merged.
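As an illustration only, a rule of this shape can be sketched as a set of compared fields plus an extra condition. The record layout, the field keys (`computer_name`, `serial_number`), and the `MatchingRule` class below are hypothetical names chosen for this sketch, not part of IT Visibility. The example models a rule on computer name and serial number that requires a serial number in both records.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical record shape: a flat mapping of field name to value.
Record = dict[str, Optional[str]]


def norm(value: Optional[str]) -> str:
    """Fold case so comparisons are case-insensitive; None becomes ''."""
    return (value or "").casefold()


@dataclass
class MatchingRule:
    fields: tuple[str, ...]                      # fields that must be equal
    condition: Callable[[Record, Record], bool]  # additional condition

    def matches(self, a: Record, b: Record) -> bool:
        same = all(norm(a.get(f)) == norm(b.get(f)) for f in self.fields)
        return same and self.condition(a, b)


# Sketch of a name + serial rule: both records must carry a serial number.
name_and_serial = MatchingRule(
    fields=("computer_name", "serial_number"),
    condition=lambda a, b: bool(
        norm(a.get("serial_number")) and norm(b.get("serial_number"))
    ),
)
```

With this sketch, `name_and_serial.matches(a, b)` is true for two records whose names and serial numbers differ only in letter case, and false when either serial number is missing.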
The current set of matching rules contains the following rules, with all rules applied simultaneously:
• Case-insensitive equality testing on the Computer name and Serial number fields, applied only if a serial number exists in both device records.
• Case-insensitive equality testing on the Computer name and Domain name fields, applied only if the device records with that name and domain contain either no non-empty serial number or exactly one distinct non-empty serial number.
Note: If two or more devices share the same name and domain but have different non-empty serial numbers, none of those devices are matched. Only the devices with that name and domain that have no serial number are matched.
• Case-insensitive equality testing on the BIOS UUID field, applied only if the Operating system field identifies the device as a VMware server.
• A rule based on the Computer name field and the UUID (taken from either the BIOS UUID field or a UUID extracted from the Serial number field), which performs the following sequential steps:
Note: This rule covers source inventory systems in which the UUID is available only as part of the serial number.
1. The Serial number field is inspected to check whether it contains a valid UUID. If so, this UUID value is used for matching; otherwise, the value from the BIOS UUID field is used.
2. The Computer name fields are compared using case-insensitive equality testing, and the UUID values are compared byte-order-insensitively.
Tip: Two UUID values are considered equal if they match exactly or if they match after one of them is byte-swapped to convert it to the other endianness.
• A rule based on the Connection ID and Serial number fields, applied only if the Operating system field identifies the device as a non-VMware server.
Note: This rule has a limit of two matching devices. If more than two devices share a particular connection and serial number combination, none of the devices in that set are matched.
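The computer name and UUID rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation. It assumes a "valid UUID" in a serial number is a canonical hyphenated UUID string embedded in the value, and that "byte-swapped" refers to reversing the first three UUID fields (the mixed-endian layout exposed by Python's `uuid.UUID.bytes_le`). The field keys and function names are hypothetical.

```python
import re
import uuid
from typing import Optional

# Assumption: the UUID appears in canonical 8-4-4-4-12 form inside the serial.
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def effective_uuid(record: dict) -> Optional[uuid.UUID]:
    """Step 1: prefer a UUID found in the serial number, else the BIOS UUID."""
    found = UUID_PATTERN.search(record.get("serial_number") or "")
    if found:
        return uuid.UUID(found.group(0))
    try:
        return uuid.UUID(record.get("bios_uuid") or "")
    except ValueError:
        return None  # no usable UUID in either field


def uuids_equal(a: uuid.UUID, b: uuid.UUID) -> bool:
    """Equal exactly, or equal after swapping the first three fields' bytes."""
    return a == b or a.bytes == b.bytes_le


def name_and_uuid_match(a: dict, b: dict) -> bool:
    """Step 2: case-insensitive names plus byte-order-insensitive UUIDs."""
    ua, ub = effective_uuid(a), effective_uuid(b)
    if ua is None or ub is None:
        return False
    same_name = (a.get("computer_name") or "").casefold() == (
        b.get("computer_name") or ""
    ).casefold()
    return same_name and uuids_equal(ua, ub)
```

For example, under these assumptions a record with serial number `VMware-4c4c4544-0042-3510-8052-b4c04f4d3132` and a record with BIOS UUID `44454C4C-4200-1035-8052-B4C04F4D3132` would match, provided that their computer names differ only in case.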
How the Properties of Duplicate Records are Merged
After a set of hardware inventory records is identified as duplicates by the matching rules, IT Visibility builds a merged (normalized) record from this set of raw source inventory records by selecting each field value from one of the source records. Different source records may contribute different fields to the same merged record.
For each property field, the value is taken from the matched duplicate records in order of inventory date, most recent first. If the most recent record does not contain the field, or its value is empty, the value is taken from the corresponding field in the next most recent source inventory record.
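The "most recent non-empty value wins" behavior can be sketched as below. The record layout (`inventory_date`, `properties`) and the function name are assumptions made for this illustration. Treating `None` and the empty string as "empty" is also an assumption.

```python
from datetime import date


def merge_properties(records: list[dict]) -> dict:
    """Build one merged property set from a group of duplicate records."""
    # Most recent inventory first.
    ordered = sorted(records, key=lambda r: r["inventory_date"], reverse=True)
    merged: dict = {}
    for record in ordered:
        for field, value in record["properties"].items():
            # Keep the first non-empty value seen, i.e. the most recent one.
            if field not in merged and value not in (None, ""):
                merged[field] = value
    return merged


records = [
    {"inventory_date": date(2024, 1, 10),
     "properties": {"model": "Model A", "ram_gb": "16"}},
    {"inventory_date": date(2024, 3, 5),
     "properties": {"model": "", "ram_gb": "32"}},
]
merged = merge_properties(records)
# merged == {"ram_gb": "32", "model": "Model A"}
```

Here the newer record supplies `ram_gb`, while `model` falls back to the older record because the newer value is empty.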
How the Software Installed on Duplicate Devices is Handled
For software and operating system inventory, the list of software installed on the merged record combines all inventory sources. In other words, it is the union of all the software recognized in each of the inventory sources.
Software is recognized by mapping evidence, including installer evidence and file evidence, to Technopedia IDs. Evidence from different sources that maps to the same software title/product/version ID is deduplicated and results in a single normalized installation. In particular, if the same software application is recognized by both installer evidence and file evidence, whether on the same device or on duplicate devices, it also results in a single normalized installation on the normalized device. All the evidence that has contributed to this normalized installation, including the type of evidence and which source devices it was found on, is retained and linked to the installation for transparency.
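A minimal sketch of this union-and-deduplicate step follows. It assumes each piece of evidence has already been mapped to a software ID. The row keys (`software_id`, `evidence_type`, `source_device`) and the ID values are hypothetical placeholders, not real Technopedia identifiers.

```python
from collections import defaultdict


def merge_installations(evidence_rows: list[dict]) -> dict:
    """Group evidence by software ID: one normalized installation per ID,
    with every contributing piece of evidence retained."""
    installations = defaultdict(list)
    for row in evidence_rows:
        installations[row["software_id"]].append(
            (row["evidence_type"], row["source_device"])
        )
    return dict(installations)


evidence = [
    {"software_id": "sw-001", "evidence_type": "installer", "source_device": "src-A"},
    {"software_id": "sw-001", "evidence_type": "file", "source_device": "src-B"},
    {"software_id": "sw-002", "evidence_type": "installer", "source_device": "src-B"},
]
installs = merge_installations(evidence)
# Two normalized installations; "sw-001" keeps both pieces of evidence.
```

Keying on the software ID gives the union across sources, while the per-ID list preserves which evidence type and source device contributed to each installation.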