benfords_law package¶
benfords_law.benfords_law module¶
Module contents¶
-
class benfords_law.BenfordsLaw(data: Union[list, numpy.array, pandas.core.series.Series])¶
Bases: object
Newcomb-Benford’s Law Analysis
Takes a list or array of numbers representing a real-world dataset and assesses whether it fits the Newcomb-Benford Law (also known as the Law of Anomalous Numbers). Fit is currently determined either by running a statistical goodness-of-fit test or by a visual test that plots the actual distribution of first significant digits in the dataset against the distribution expected under Benford’s Law.
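For reference, the expected distribution under Benford’s Law gives digit d (1 through 9) a probability of log10(1 + 1/d). The sketch below computes it with the standard library only. The function name `benford_expected` is hypothetical and not part of this package, and the string keys are an assumption chosen to mirror the Dict[str, ...] return types documented here.

```python
import math


def benford_expected():
    """Expected share of each first significant digit (1-9) under Benford's Law.

    Hypothetical helper for illustration; not part of the benfords_law package.
    """
    # P(d) = log10(1 + 1/d); the nine probabilities sum to log10(10) = 1.
    return {str(d): math.log10(1 + 1 / d) for d in range(1, 10)}


expected = benford_expected()
for digit, share in expected.items():
    print(f"{digit}: {share:.4f}")
```

About 30% of values are expected to start with 1, falling to under 5% for 9.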
-
apply_benfords_law()¶ Runs all relevant processes and then applies all tests to input dataset
-
apply_chi_sq_test(alpha=0.05) → Tuple[float, float]¶ Apply a chi-squared goodness-of-fit test to check whether the dataset’s first-significant-digit distribution matches the expectation of Benford’s Law. The dataset passes the test if the p-value is greater than the specified alpha and fails otherwise.
- Parameters
alpha – Optional. Significance level below which the null hypothesis (that the data follows Benford’s Law) is rejected. Default = 0.05
- Returns
Chi-Squared statistic, p-value
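As an illustration of what such a test computes, here is a minimal sketch. It is not this package's implementation: the function name `benford_chi_square`, its extra pass/fail return value, and the input format (a dict keyed by "1" to "9") are assumptions. To avoid third-party dependencies it uses the closed-form chi-squared survival function, which is exact for even degrees of freedom (here 8, from nine digit categories minus one); a library routine such as scipy.stats.chisquare would serve the same purpose.

```python
import math


def benford_chi_square(counts, alpha=0.05):
    """Chi-squared goodness-of-fit of first-digit counts against Benford's Law.

    Hypothetical helper; `counts` maps "1".."9" to observed frequencies.
    Returns (statistic, p_value, passes).
    """
    observed = [counts.get(str(d), 0) for d in range(1, 10)]
    n = sum(observed)
    if n == 0:
        raise ValueError("no first significant digits to test")

    # Expected frequency of digit d is n * log10(1 + 1/d).
    expected = [n * math.log10(1 + 1 / d) for d in range(1, 10)]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function for 8 degrees of freedom:
    # Q(x) = exp(-x/2) * sum_{i=0}^{3} (x/2)^i / i!
    half = statistic / 2
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(4))

    # The dataset passes when the null hypothesis is not rejected.
    return statistic, p_value, p_value > alpha


# Counts that closely follow Benford's Law (n = 1000).
benford_like = dict(zip("123456789", [301, 176, 125, 97, 79, 67, 58, 51, 46]))
# Counts spread evenly across digits (n = 999).
uniform = {str(d): 111 for d in range(1, 10)}

stat_good, p_good, passes_good = benford_chi_square(benford_like)
stat_bad, p_bad, passes_bad = benford_chi_square(uniform)
```

The Benford-like counts give a statistic near zero and a p-value near 1, so they pass at alpha = 0.05, while the uniform counts are rejected.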
-
apply_visual_test(figsize: Tuple[int, int] = (15, 7))¶ Plot the first significant digit distribution against the expectation of Benford’s Law
- Parameters
figsize – Dimensions of the figure to plot in the format: (width, height)
-
get_counts() → Dict[str, int]¶ Get frequency of first significant digits passed in the dataset.
- Returns
Key-value pairs of each first significant digit and its respective frequency
-
get_distribution() → Dict[str, float]¶ Get percentage distribution of first significant digits passed in the dataset
- Returns
Key-value pairs of each first significant digit and its respective percentage
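The counting and percentage steps can be sketched as follows. This is an assumption-laden illustration rather than the package's code: the helper names (`first_significant_digit`, `fsd_counts`, `fsd_distribution`) are hypothetical, zeros are skipped because they have no significant digit, and percentages are assumed to be on a 0-100 scale.

```python
from collections import Counter


def first_significant_digit(value):
    """Return the first non-zero digit of `value` as a string, or None for zero."""
    # str() puts the mantissa first, even in exponent notation ("1.5e-07"),
    # so the first character in 1-9 is the first significant digit.
    for ch in str(abs(value)):
        if ch in "123456789":
            return ch
    return None


def fsd_counts(data):
    """Frequency of each first significant digit, keyed "1" through "9"."""
    digits = (first_significant_digit(v) for v in data)
    counts = Counter(d for d in digits if d is not None)
    return {str(d): counts.get(str(d), 0) for d in range(1, 10)}


def fsd_distribution(data):
    """Percentage (0-100) of each first significant digit."""
    counts = fsd_counts(data)
    total = sum(counts.values())
    return {d: (100 * c / total if total else 0.0) for d, c in counts.items()}


sample = [123, 0.0042, 19, 250, -31, 0, 1.5e-7]
counts = fsd_counts(sample)
distribution = fsd_distribution(sample)
```

In this sample, three of the six non-zero values start with 1, so `distribution["1"]` is 50.0.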
-
prepare_actual_distribution(get_fsd_counts: bool = False)¶
-