Why are there so many identical or nonexistent assets in my inventory?

Modified on Mon, 25 Sep 2023 at 03:05 PM

There are a number of possible causes of apparent duplicate assets in your runZero inventory. This document describes a few of them, with suggestions on how to reduce duplication.

Note that once duplicate assets are created, they will not resolve themselves automatically at a later date; scans and imports only compare new incoming data with existing assets. There's no process to compare every existing asset with every other existing asset for possible matches, as that would require an impractical amount of computing resources.

In some cases you may get duplicate assets for nonexistent systems — that is, where there are known to be no systems present at the appropriate IP address. This is typically caused by routers and proxies; see below.

Duplicates caused by scanning multiple sites

The most common cause of duplicate assets in the runZero inventory is scanning the same devices from multiple sites. A runZero site represents a distinct network whose IP addresses may overlap with those of any other site. An address like 10.1.2.3 in site A's network is therefore treated as completely separate from 10.1.2.3 in site B, even if the asset looks the same.

You can generally tell if this is the issue by looking at the site names of the assets, and examining information such as the MAC addresses to determine whether they really are the same device.
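The MAC address comparison described above can be sketched in a few lines of Python. This is only an illustration: the records and field names (id, site, address, macs) are hypothetical and do not reflect an actual runZero export format.

```python
from collections import defaultdict

# Hypothetical asset records; the field names are illustrative assumptions.
assets = [
    {"id": "a1", "site": "HQ", "address": "10.1.2.3", "macs": ["00:11:22:33:44:55"]},
    {"id": "a2", "site": "Branch", "address": "10.1.2.3", "macs": ["00:11:22:33:44:55"]},
    {"id": "a3", "site": "Branch", "address": "10.1.2.4", "macs": ["66:77:88:99:AA:BB"]},
]


def macs_seen_in_multiple_sites(records):
    """Map each MAC address to its sites, keeping only MACs seen in more than one site."""
    sites_by_mac = defaultdict(set)
    for record in records:
        for mac in record["macs"]:
            # Normalize case so the same MAC always compares equal.
            sites_by_mac[mac.lower()].add(record["site"])
    return {mac: sorted(sites) for mac, sites in sites_by_mac.items() if len(sites) > 1}


print(macs_seen_in_multiple_sites(assets))
```

A MAC address that appears in more than one site suggests the same physical device was scanned from multiple sites, although it is worth confirming by hand before deleting anything.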

Router and firewall issues

Another common cause of duplicate assets is enterprise routers and firewalls. Some examples:

  • Some enterprise routers, like Cisco ASA devices, are designed to reply to all unexpected connection attempts on a particular port with a TCP reset (RST).
  • Fortinet FortiGate firewalls are often configured to respond to every IP address on port 8008 for their policy override API.
  • Some SIP gateways listen to SIP traffic on all addresses and automatically respond to it. Common ports for VoIP gateway responses are 1720/tcp (H.323 call setup) and 5060/tcp (SIP).
  • Some networks are set up with web proxies that respond to traffic sent to any address, including internal addresses.

runZero will generally detect when a router or firewall is replying to every connection attempt, and will avoid creating assets based on those responses. It can't always do so safely, however, because it might risk losing information about actual devices. If runZero doesn’t detect that a network appliance is spoofing responses, or if genuine device responses are mixed in with the spoofed responses, you may end up with a substantial number of identical assets in your inventory.

In some cases, proxies can become overloaded by scan traffic and start responding to only some IP addresses, leaving random gaps. This will typically prevent runZero from detecting the spoofing.

The best solution is to configure the device to ignore the machine hosting the runZero Explorer and not respond to it. This is because even if runZero detects the spoofing, the connections and responses still take up network bandwidth, and can make your scans run very slowly. For web proxies, it may be unintentional that they are responding to internal network connections — typically a proxy is only needed for connections to external web sites.

Here are a few workarounds if you can’t prevent your device from replying to all connections:

  • Exclude the ports the device responds to from the scan configuration.
  • Exclude all or part of the router’s IP address range from the scan.
  • Create a post-scan rule to delete any assets within the subnet that have the affected ports open.
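The selection logic behind the last workaround can be sketched in Python as follows. This is not runZero's rule engine; the function name, record fields, subnet, and port values are all hypothetical. The sketch is also deliberately stricter than the bullet above: it only flags assets whose open ports are all in the affected set, so that a genuine device with other services is left alone.

```python
import ipaddress


def likely_spoofed(asset, subnet, affected_ports):
    """True if the asset is inside the subnet and every open port is an affected port."""
    in_subnet = ipaddress.ip_address(asset["address"]) in ipaddress.ip_network(subnet)
    ports = set(asset["open_ports"])
    # Require at least one open port, and no ports outside the affected set.
    return in_subnet and bool(ports) and ports <= set(affected_ports)


# Hypothetical inventory records for illustration.
assets = [
    {"address": "10.20.0.5", "open_ports": [8008]},      # only the spoofed port
    {"address": "10.20.0.6", "open_ports": [22, 8008]},  # real service present
    {"address": "10.30.0.7", "open_ports": [8008]},      # outside the subnet
]

to_delete = [a["address"] for a in assets if likely_spoofed(a, "10.20.0.0/24", {8008})]
print(to_delete)
```

Reviewing the selected assets before deleting them helps avoid removing real devices that happen to sit behind the same appliance.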

If you need help bulk-deleting unwanted records, please contact our support team.

Assets not matching across inventory sources

If you have enabled one or more integrations with third-party sources, you may get duplicate assets because the imported device information fails to match the information from the runZero scan. Many integration sources provide very little information that can be used to match devices reliably, or they identify devices inaccurately. This problem is more common if you set up an external integration before performing any runZero scans.

Note that connector tasks always operate across multiple sites, as external sources do not have any knowledge of runZero site definitions. If you configure a connector task with a specific site, that site is used as a default for new assets when no matching existing asset is found in any site. If an asset is first created by a connector task, it may not end up in the site representing its actual network location. It will then fail to match when a runZero scan is performed.

One way to avoid this is to remove unnecessary sites — if IP ranges aren't actually distinct (possibly overlapping) networks, they don't need to be configured as sites, and doing so will make it more likely that you encounter issues with duplicate assets.

Another way to reduce duplicates caused by integrations is to use the connector task "Exclude unknown assets" option. This will avoid creating new assets based only on third party data, instead waiting until a runZero scan has identified the asset and then merging in the third party data next time the connector task runs.
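The connector behavior described in this section can be modeled with a short Python sketch. It is a simplified illustration, not runZero's matching algorithm: matching on MAC address alone is an assumption, and the function, parameter, and field names are hypothetical. Only the "Exclude unknown assets" option itself comes from the product.

```python
def place_imported_record(record, existing_assets, default_site, exclude_unknown=False):
    """Return the site where an imported record lands, or None if it is skipped."""
    record_macs = {mac.lower() for mac in record["macs"]}
    for asset in existing_assets:
        # Assumption: a shared MAC address counts as a match, in any site.
        if record_macs & {mac.lower() for mac in asset["macs"]}:
            return asset["site"]  # merged into the existing asset
    if exclude_unknown:
        return None  # skipped until a scan has created the asset
    return default_site  # new asset created in the task's default site


existing = [{"site": "Branch", "macs": ["00:11:22:33:44:55"]}]
known = {"macs": ["00:11:22:33:44:55"]}
unknown = {"macs": ["66:77:88:99:AA:BB"]}

print(place_imported_record(known, existing, "HQ"))
print(place_imported_record(unknown, existing, "HQ"))
print(place_imported_record(unknown, existing, "HQ", exclude_unknown=True))
```

In this model, an unmatched record lands in the default site even if the device really lives in another site's network, which is how a later scan of that network can produce a second asset. Skipping unmatched records avoids that.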

The runZero engineering team are continually updating the algorithms used to match up third party data. If the assets in question were created from a third party import some time ago, the easiest fix is often to delete them and let the software try again with the latest algorithms.

Finding duplicate assets

There are asset search keywords you can use to look for possible duplicate assets:

  • mac_overlap
  • name_overlap
  • address_overlap
  • address_extra_overlap

These search options will tend to bring up false positives when used in isolation. For example, it's common for many IP addresses to resolve to the same domain name via reverse DNS, in which case name_overlap will find all of the systems with the same DNS name. Combining the search keywords can lead to more accurate results. For example:

name_overlap:t and (address_overlap:t or address_extra_overlap:t)
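To see why the combined query is more selective, the same boolean logic can be applied to a few sample records in Python. The records are made up for illustration, and the flags are assumed to have already been computed; this sketch does not reproduce how runZero evaluates the overlap keywords.

```python
# Hypothetical assets with precomputed overlap flags.
assets = [
    {"name": "web01", "name_overlap": True, "address_overlap": True, "address_extra_overlap": False},
    # Shares a reverse DNS name with other systems, but no addresses overlap.
    {"name": "pool.example.net", "name_overlap": True, "address_overlap": False, "address_extra_overlap": False},
    {"name": "db01", "name_overlap": False, "address_overlap": True, "address_extra_overlap": False},
]


def matches_combined_query(asset):
    """Mirror: name_overlap:t and (address_overlap:t or address_extra_overlap:t)."""
    return asset["name_overlap"] and (asset["address_overlap"] or asset["address_extra_overlap"])


candidates = [a["name"] for a in assets if matches_combined_query(a)]
print(candidates)
```

Only the asset that overlaps on both name and address remains, so the shared reverse DNS name on its own no longer produces a match.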
