Definition
Benford's law is a statistical principle that describes the distribution of leading digits in many naturally occurring data sets. It states that the digit 1 appears as the first digit more often than any other digit, about 30% of the time, and that each larger digit appears less often, down to the digit 9 at under 5%. For example, in a data set of city populations, far more of the figures begin with 1 (such as 1,400, 18,000, or 120,000) than with 9, whether the cities are small or large.
Use Case
One use case for Benford's law is in detecting fraudulent or irregular data. Because the leading digits in many naturally occurring data sets follow the pattern described by Benford's law, a marked deviation from this pattern can indicate that the data has been manipulated or fabricated. For example, if a company reports financial data that does not conform to Benford's law, that is a reason to examine the figures more closely. It is a warning sign that the company may be trying to mislead its investors or hide something, not proof of it.
Here are some examples of data sets that might follow Benford's law:
Populations of cities
Stock prices
Physical constants (e.g., the speed of light, the gravitational constant)
Lengths of rivers
Heights of mountains
On the other hand, data sets that have been artificially generated or manipulated might not follow Benford's law. For example, random numbers drawn uniformly from a fixed range, such as 1 to 9,999, will not follow Benford's law, because every leading digit from 1 to 9 is then about equally likely.
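The contrast between the two kinds of data can be sketched in Python. This is only an illustration under stated assumptions: the powers of 2 stand in for a data set that grows multiplicatively across many orders of magnitude, the uniform integers stand in for artificially generated data, and the helper names first_digit and first_digit_shares are hypothetical.

```python
import random
from collections import Counter

def first_digit(n: int) -> int:
    """Return the leading decimal digit of a positive integer."""
    return int(str(n)[0])

def first_digit_shares(values):
    """Map each digit 1-9 to the share of values that start with it."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Assumption: powers of 2 as a stand-in for data spanning many orders of magnitude.
powers_of_two = [2 ** k for k in range(1, 1001)]

# Assumption: uniform integers from 1 to 9,999 as a stand-in for artificial data.
rng = random.Random(0)  # fixed seed so the run is repeatable
uniform = [rng.randint(1, 9999) for _ in range(10_000)]

power_shares = first_digit_shares(powers_of_two)
uniform_shares = first_digit_shares(uniform)

for d in range(1, 10):
    print(f"{d}: powers of 2 {power_shares[d]:.1%}, uniform {uniform_shares[d]:.1%}")
```

In this sketch, roughly 30% of the powers of 2 start with 1, while each leading digit accounts for roughly one ninth of the uniform integers.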
Probabilities
According to Benford's law, the probability that a number will have each of the digits 1 through 9 as its first digit is as follows:
1: 30.1%
2: 17.6%
3: 12.5%
4: 9.7%
5: 7.9%
6: 6.7%
7: 5.8%
8: 5.1%
9: 4.6%
These probabilities follow from the formula P(d) = log10(1 + 1/d), where d is the leading digit. The pattern arises in data sets that span several orders of magnitude and are spread roughly evenly on a logarithmic scale. On such a scale, the interval from 1 to 2 is much wider than the interval from 9 to 10 (about 30.1% of a decade versus 4.6%), so values beginning with 1 are more common than values beginning with 9, whatever the size of the numbers.
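As a quick check, the percentages listed above can be reproduced from the logarithmic form in which Benford's law is usually stated, P(d) = log10(1 + 1/d). The following is a minimal sketch; the function name benford_probability is an assumption made for illustration.

```python
import math

def benford_probability(digit: int) -> float:
    """Expected share of values whose first digit is `digit` (1 through 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)

# Print the expected share for each leading digit, rounded to one decimal place.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The nine values sum to 1, because the product of the terms (1 + 1/d) for d from 1 to 9 is exactly 10.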
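It's important to note that these probabilities are rounded, and real data sets match them only approximately. They also do not hold for all data sets: values confined to a narrow range (such as adult human heights) or numbers that are assigned rather than measured (such as telephone numbers) do not follow the pattern. However, they provide a useful benchmark for comparing the distribution of leading digits in different data sets and detecting deviations from expected patterns.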
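One way to use the benchmark is to score how far a data set's leading digits fall from the expected shares. The sketch below uses a chi-square statistic for this, which is a common choice but an assumption here, not a method named in the text. The function names and both sample data sets are hypothetical.

```python
import math
from collections import Counter

def leading_digit(x: float) -> int:
    """Leading nonzero decimal digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. 0.0042 -> '4.2...e-03'.
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values) -> float:
    """Chi-square distance between observed first-digit counts and Benford's law."""
    digits = [leading_digit(v) for v in values if v != 0]  # zero has no leading digit
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    n = len(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # expected count for digit d
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic

# Hypothetical data: multiplicative growth, expected to sit close to the benchmark.
growth = [2 ** k for k in range(1, 1001)]
# Hypothetical data: made-up amounts that all begin with the digit 5.
suspicious = [500 + i for i in range(100)]

print(f"growth:     {benford_chi_square(growth):.2f}")
print(f"suspicious: {benford_chi_square(suspicious):.2f}")
```

A larger score means a larger deviation. With nine digit categories there are 8 degrees of freedom, for which the usual 5% critical value is about 15.51, so a score well above that suggests the data deserves a closer look. With small samples the statistic is unreliable, and a high score is a prompt for investigation rather than evidence of fraud.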