Scan For Leaks of Personally Identifiable Information (PII)

Fluxguard crawls an organization’s digital assets and uses machine learning (ML) to detect inadvertent leaks of personally identifiable information (PII). Fluxguard provides IT staff with comprehensive reports and dashboards to understand the sources, scope, and potential risks of detected PII leaks.

Fluxguard’s PII scan supports compliance with local and federal law, including the Health Insurance Portability and Accountability Act of 1996 (HIPAA). More importantly, Fluxguard helps safeguard customers’ and citizens’ protected information, and gives them confidence that your organization is keeping their secrets safe.

Inadvertent leaks of personally identifiable information can cause serious financial and reputational harm to customers, businesses, and governments. By using Fluxguard’s machine learning scans, businesses and governments can better comply with local, state, and federal regulations and reduce the risk of PII leaks. This helps protect customers, maintain trust, and build credibility.

Step 1: Fluxguard Crawls and Creates a Baseline of Your Organization’s Digital Footprint

Fluxguard scans everything.

Fluxguard crawls all your consumer-facing web data, including HTML, PDFs, Word documents, and any other digital representation of text.

Fluxguard downloads all assets loaded from any web page, including JavaScript, style sheets, and other code that makes up the styling and functionality of your site. (These too can inadvertently contain PII.) Fluxguard also scans web forms, contact forms, and other interactive tools that could be used to enter and submit PII data.

Fluxguard audits every page for PII.

All content is scanned for PII. Fluxguard creates a point-in-time baseline of all HTML, text, images, and other assets. This includes multiple screenshots of every page, as well as the fully rendered DOM.

By creating a baseline, Fluxguard has a point-in-time record of your digital presence. On future crawls, Fluxguard detects PII changes relative to this baseline. This way, Fluxguard will not repeatedly alert on legitimate PII already present on your site (such as executive bios), but will alert only on additions of new PII.

For example, if any public portion of crawled content contains “123 Main Street,” Fluxguard will flag the page containing this PII. Fluxguard even allows you to control the thresholds of PII exposure, by alternately whitelisting or flagging IP addresses, full names, partial names, bank accounts, and credit card numbers. By creating a baseline and finding discrepancies in the baseline, Fluxguard is designed to detect and alert you to any potential exposure of PII on your website.
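The baseline comparison described above can be illustrated with a minimal sketch. It assumes each detected item can be reduced to a page, an entity type, and the matched text. The `Finding` class, the `new_findings` function, and the `ignored_types` parameter are hypothetical names chosen for illustration and are not part of Fluxguard's product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One detected PII entity (hypothetical structure, not Fluxguard's schema)."""
    page: str
    entity_type: str
    text: str


def new_findings(baseline, current, ignored_types=frozenset()):
    """Return findings in `current` that are absent from `baseline`.

    Entity types listed in `ignored_types` (whitelisted types) are skipped.
    """
    known = set(baseline)
    return [
        f for f in current
        if f not in known and f.entity_type not in ignored_types
    ]


# Baseline crawl: an executive bio that legitimately contains a name.
baseline = [Finding("/about", "NAME", "Jane Doe")]

# Later crawl: the bio is unchanged, but two new items have appeared.
current = [
    Finding("/about", "NAME", "Jane Doe"),
    Finding("/contact", "ADDRESS", "123 Main Street"),
    Finding("/status", "IP_ADDRESS", "198.51.100.0"),
]

# Whitelist IP addresses, so only the new street address is reported.
alerts = new_findings(baseline, current, ignored_types={"IP_ADDRESS"})
for f in alerts:
    print(f"{f.page}: new {f.entity_type} -> {f.text}")
```

In this sketch the unchanged executive bio produces no alert, the whitelisted IP address is skipped, and only the newly added street address is surfaced.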

Step 2: Fluxguard Uses ML to Detect PII

Fluxguard utilizes Named Entity Recognition (NER), a machine learning technique within Natural Language Processing, to detect PII changes across your digital portfolio. Instead of relying on brittle dictionaries of names, for example, machine learning can detect new names or sequences it has never encountered before. This versatility makes it well suited to PII detection in multiple languages.

Fluxguard’s machine learning models are continually trained and improved to better recognize PII and its variants. Fluxguard also pairs these models with intelligent alerting features, so users can act immediately on any suspicious PII change.

Fluxguard can be easily configured to detect all or some of the following:

  • ADDRESS A physical address, such as “100 Main Street, Anytown, USA” or “Suite #12, Building 123.” An address can include information such as the street, building, location, city, state, country, county, zip code, precinct, and neighborhood.
  • AGE An individual’s age, including the quantity and unit of time. For example, in the phrase “I am 40 years old,” Fluxguard recognizes “40 years” as an age.
  • AWS_ACCESS_KEY A unique identifier that’s associated with a secret access key; you use the access key ID and secret access key to sign programmatic AWS requests cryptographically.
  • AWS_SECRET_KEY A unique identifier that’s associated with an access key. You use the access key ID and secret access key to sign programmatic AWS requests cryptographically.
  • CREDIT_DEBIT_CVV A three-digit card verification code (CVV) that is present on VISA, MasterCard, and Discover credit and debit cards. For American Express credit or debit cards, the CVV is a four-digit numeric code.
  • CREDIT_DEBIT_EXPIRY The expiration date for a credit or debit card. This number is usually four digits long and is often formatted as month/year or MM/YY. Fluxguard recognizes expiration dates such as 01/21, 01/2021, and Jan 2021.
  • CREDIT_DEBIT_NUMBER The number for a credit or debit card. These numbers can vary from 13 to 16 digits in length. However, Fluxguard also recognizes credit or debit card numbers when only the last four digits are present.
  • DATE_TIME A date can include a year, month, day, day of week, or time of day. For example, Fluxguard recognizes “January 19, 2020” or “11 am” as dates. Fluxguard will recognize partial dates, date ranges, and date intervals. It will also recognize decades, such as “the 1990s”.
  • DRIVER_ID The number assigned to a driver’s license, which is an official document permitting an individual to operate one or more motorized vehicles on a public road. A driver’s license number consists of alphanumeric characters.
  • EMAIL An email address, such as name@example.com.
  • INTERNATIONAL_BANK_ACCOUNT_NUMBER An International Bank Account Number has specific formats in each country. See www.iban.com/structure.
  • IP_ADDRESS An IPv4 address, such as 198.51.100.0.
  • LICENSE_PLATE A license plate for a vehicle is issued by the state or country where the vehicle is registered. The format for passenger vehicles is typically five to eight characters, consisting of upper-case letters and numbers. The format varies depending on the location of the issuing state or country.
  • MAC_ADDRESS A media access control (MAC) address is a unique identifier assigned to a network interface controller (NIC).
  • NAME An individual’s name. This entity type does not include titles, such as Dr., Mr., Mrs., or Miss. Fluxguard does not apply this entity type to names that are part of organizations or addresses. For example, Fluxguard recognizes the “John Doe Organization” as an organization, and it recognizes “Jane Doe Street” as an address.
  • PASSWORD An alphanumeric string that is used as a password, such as “*very20special#pass*”.
  • PHONE A phone number. This entity type also includes fax and pager numbers.
  • PIN A four-digit personal identification number (PIN) with which you can access your bank account.
  • SWIFT_CODE A SWIFT code is a standard format of Bank Identifier Code (BIC) used to specify a particular bank or branch. Banks use these codes for money transfers such as international wire transfers. SWIFT codes consist of eight or 11 characters. The 11-character codes refer to specific branches, while eight-character codes (or 11-character codes ending in ’XXX’) refer to the head or primary office.
  • URL A web address, such as www.example.com.
  • USERNAME A user name that identifies an account, such as a login name, screen name, nick name, or handle.
  • VEHICLE_IDENTIFICATION_NUMBER A Vehicle Identification Number (VIN) uniquely identifies a vehicle. VIN content and format are defined in the ISO 3779 specification. Each country has specific codes and formats for VINs.
  • CA_HEALTH_NUMBER A Canadian Health Service Number is a 10-digit unique identifier, required for individuals to access healthcare benefits.
  • CA_SOCIAL_INSURANCE_NUMBER A Canadian Social Insurance Number (SIN) is a nine-digit unique identifier, required for individuals to access government programs and benefits. The SIN is formatted as three groups of three digits, such as 123-456-789. A SIN can be validated through a simple check-digit process called the Luhn algorithm.
  • IN_AADHAAR An Indian Aadhaar is a 12-digit unique identification number issued by the Indian government to the residents of India. The Aadhaar format has a space or hyphen after the fourth and eighth digit.
  • IN_NREGA An Indian National Rural Employment Guarantee Act (NREGA) number consists of two letters followed by 14 numbers.
  • IN_PERMANENT_ACCOUNT_NUMBER An Indian Permanent Account Number is a unique 10-character alphanumeric identifier issued by the Income Tax Department.
  • IN_VOTER_NUMBER An Indian Voter ID consists of three letters followed by seven numbers.
  • UK_NATIONAL_HEALTH_SERVICE_NUMBER A UK National Health Service Number is a 10-17 digit number, such as 485 777 3456. The current system formats the 10-digit number with spaces after the third and sixth digits. The final digit is an error-detecting checksum. The 17-digit number format has spaces after the 10th and 13th digits.
  • UK_NATIONAL_INSURANCE_NUMBER A UK National Insurance Number (NINO) provides individuals with access to National Insurance (social security) benefits. It is also used for some purposes in the UK tax system. The number is nine characters long: two letters, followed by six numbers and one letter. A NINO can be formatted with a space or a dash after the two letters and after the second, fourth, and sixth digits.
  • UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER A UK Unique Taxpayer Reference (UTR) is a 10-digit number that identifies a taxpayer or a business.
  • BANK_ACCOUNT_NUMBER A US bank account number, which is typically 10 to 12 digits long. Fluxguard also recognizes bank account numbers when only the last four digits are present.
  • BANK_ROUTING A US bank account routing number. These are typically nine digits long, but Fluxguard also recognizes routing numbers when only the last four digits are present.
  • PASSPORT_NUMBER A US passport number. Passport numbers range from six to nine alphanumeric characters.
  • US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER A US Individual Taxpayer Identification Number (ITIN) is a nine-digit number that starts with a “9” and contains a “7” or “8” as the fourth digit. An ITIN can be formatted with a space or a dash after the third and fifth digits.
  • SSN A US Social Security Number (SSN) is a nine-digit number that is issued to US citizens, permanent residents, and temporary working residents. Fluxguard also recognizes Social Security Numbers when only the last four digits are present.
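Several of the entity types above carry a checksum. The list notes, for instance, that a Canadian SIN can be validated with the Luhn algorithm. The following is a minimal sketch of that check, shown only to illustrate how a checksum can weed out digit sequences that merely look like an identifier. The function names `luhn_valid` and `looks_like_sin` are hypothetical, and nothing here is meant to describe how Fluxguard's models work internally.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Non-digit characters such as spaces and hyphens are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


def looks_like_sin(candidate: str) -> bool:
    """Nine digits that also pass the Luhn check (illustrative only)."""
    digit_count = sum(ch.isdigit() for ch in candidate)
    return digit_count == 9 and luhn_valid(candidate)


print(looks_like_sin("046 454 286"))  # a commonly cited test value
print(looks_like_sin("123-456-789"))  # right shape, but fails the checksum
```

Note that 123-456-789 has the right shape for a SIN but does not pass the checksum, which is the kind of distinction a check-digit test makes possible.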

Step 3: Fluxguard Immediately Alerts Key Staff When PII Is Detected

Fluxguard relies on change detection to surface potential PII leaks. This approach reduces repetitive alerts and false positives by narrowing the monitoring surface to only the specific text, HTML code, or other content that changed between versions.
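As a rough sketch of this idea, the example below diffs two versions of a page with Python's standard `difflib` module and scans only the added lines. A simple email regular expression stands in for the ML detector, purely to keep the example self-contained. The function names and sample text are assumptions for illustration, not Fluxguard's implementation.

```python
import difflib
import re

# Stand-in detector: a simple email pattern instead of an ML model.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def added_lines(old_text: str, new_text: str) -> list:
    """Return lines that appear in `new_text` but not in `old_text`."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff prefixes added lines with "+ "; strip the two-character marker.
    return [line[2:] for line in diff if line.startswith("+ ")]


def scan_changes(old_text: str, new_text: str) -> list:
    """Run the detector only on content added since the previous version."""
    hits = []
    for line in added_lines(old_text, new_text):
        hits.extend(EMAIL_RE.findall(line))
    return hits


old = "Contact our team\nPress: press@example.com\n"
new = (
    "Contact our team\n"
    "Press: press@example.com\n"
    "Escalations: jane.doe@example.com\n"
)

print(scan_changes(old, new))
```

Because the press address was already present in the earlier version, only the newly added address is reported.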

Fluxguard integrates with the tools you’re familiar with, including email, webhooks, API, Slack, and more. Fluxguard highlights text changes in summary emails, showing exactly what has been added and removed between versions.

Opt to receive alerts as soon as potential PII leaks are found, or bundle them into a daily or weekly digest. Fluxguard also provides a comprehensive audit log of changes, enabling you to track all changes to every consumer-facing file over the course of its lifetime.
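To make the digest option concrete, here is a minimal sketch of grouping alerts into daily or weekly buckets. The `bundle` function, its `mode` parameter, and the sample alert messages are hypothetical and only illustrate the grouping logic. With immediate alerting, no bundling step like this would be needed.

```python
from collections import defaultdict
from datetime import datetime


def bundle(alerts, mode="daily"):
    """Group (timestamp, message) pairs into digest buckets.

    `mode` is either "daily" (keyed by calendar date) or "weekly"
    (keyed by ISO year and week number).
    """
    buckets = defaultdict(list)
    for detected_at, message in alerts:
        if mode == "daily":
            key = detected_at.date().isoformat()
        elif mode == "weekly":
            year, week, _weekday = detected_at.isocalendar()
            key = f"{year}-W{week:02d}"
        else:
            raise ValueError(f"unknown digest mode: {mode!r}")
        buckets[key].append(message)
    return dict(buckets)


alerts = [
    (datetime(2024, 1, 1, 9, 30), "New ADDRESS on /contact"),
    (datetime(2024, 1, 3, 14, 5), "New EMAIL on /team"),
    (datetime(2024, 1, 8, 8, 0), "New PHONE on /support"),
]

for key, messages in bundle(alerts, mode="weekly").items():
    print(key, messages)
```

The same list of alerts can be regrouped by calling `bundle(alerts, mode="daily")`, which yields one bucket per calendar date.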

PII Scan requires our help to configure Fluxguard to its full potential.

Get Started with Fluxguard Today

Get a Guided Demo

Schedule a free 30-minute meeting with us to see how Fluxguard can work for your business.

Start a Free Trial of Fluxguard

Sign up for a no-obligation trial of Fluxguard and start monitoring websites within minutes.