Reference
BestParameterFinder
Source code in code\bpf.py
__init__(metric=None)
Initializes the BestParameterFinder.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
metric | Optional[Callable[[BestParameterFinder, ndarray], float]] | A custom metric function. Defaults to | None |
Source code in code\bpf.py
bestParameterFinder(landmarks, data, minBound=-25, maxBound=-1, granularity=5, epsilon=1, approx=None)
Finds optimal (C, P_G) parameters minimizing the metric.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
landmarks | List | Landmarks to use in solving. | required |
data | Union[Data, Partitions] | Input dataset. | required |
minBound | float | Minimum log-bound for search. Defaults to -25. | -25 |
maxBound | float | Maximum log-bound for search. Defaults to -1. | -1 |
granularity | int | Granularity of grid search. Defaults to 5. | 5 |
epsilon | float | Precision threshold. Defaults to 1. | 1 |
approx | Optional[int] | Approximation iteration count. Defaults to None. | None |
Returns:
Type | Description |
---|---|
tuple[float, float] | Best (C, P_G) parameters (in real scale). |
Source code in code\bpf.py
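The parameter list above suggests a grid search over log-scale bounds. The following is a minimal sketch of one way such a search could work, not the code in bpf.py. The function name `log_grid_search`, the coarse-to-fine refinement rule, and the base-10 conversion to real scale are all assumptions made for illustration.

```python
import itertools


def log_grid_search(metric, min_bound=-25.0, max_bound=-1.0, granularity=5, epsilon=1.0):
    """Coarse-to-fine grid search over (c, p_g) exponents (hypothetical helper)."""
    if granularity < 4:
        raise ValueError("granularity must be at least 4 so the window shrinks")
    lo_c, hi_c = min_bound, max_bound
    lo_p, hi_p = min_bound, max_bound
    while True:
        # Evenly spaced exponents along each axis of the current window.
        step_c = (hi_c - lo_c) / (granularity - 1)
        step_p = (hi_p - lo_p) / (granularity - 1)
        cs = [lo_c + i * step_c for i in range(granularity)]
        ps = [lo_p + i * step_p for i in range(granularity)]
        # Keep the grid point with the smallest metric value.
        best_c, best_p = min(itertools.product(cs, ps), key=lambda cp: metric(*cp))
        if max(step_c, step_p) <= epsilon:
            # Assumption: the bounds are base-10 exponents.
            return 10.0 ** best_c, 10.0 ** best_p
        # Shrink the window to one grid step around the current best point.
        lo_c, hi_c = max(min_bound, best_c - step_c), min(max_bound, best_c + step_c)
        lo_p, hi_p = max(min_bound, best_p - step_p), min(max_bound, best_p + step_p)
```

Here `metric` takes the two exponents directly; in the documented class the metric is applied to voltages computed by calculateFor, which this sketch does not model.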
calculateFor(landmarks, data, c, p_g, approx=False, approx_epsilon=None, approx_iters=None)
Calculates voltages and applies the metric.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
landmarks | List | Landmarks to add to the problem. | required |
data | Union[Data, Partitions] | Input data. | required |
c | float | Kernel parameter (log space). | required |
p_g | float | Resistance to ground (log space). | required |
approx | bool | Whether to use approximation. Defaults to False. | False |
approx_epsilon | Optional[float] | Epsilon value for approximation. | None |
approx_iters | Optional[int] | Number of approximation iterations. | None |
Returns:
Type | Description |
---|---|
Union[float, tuple[np.ndarray, voltage.Problem]] | Metric value, or the voltages and problem. |
Source code in code\bpf.py
expWithStd(voltages, base=10)
Computes the normalized exponential distance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
voltages | ndarray | Array of voltage values. | required |
base | float | Base of the exponential. Defaults to 10. | 10 |
Returns:
Type | Description |
---|---|
float | Normalized exponential distance. |
Source code in code\bpf.py
median(voltages, value=0.5)
Computes the absolute difference between the median voltage and a given value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
voltages | ndarray | Array of voltage values. | required |
value | float | Value to compare the median to. Defaults to 0.5. | 0.5 |
Returns:
Type | Description |
---|---|
float | Absolute difference from the median. |
Source code in code\bpf.py
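As a rough illustration of the description above, the metric reduces to a single expression. This sketch is a standalone function (`median_metric` is a hypothetical name) and uses a plain Python sequence in place of an ndarray so that it runs without third-party packages.

```python
import statistics


def median_metric(voltages, value=0.5):
    """Absolute difference between the median voltage and a target value."""
    return abs(statistics.median(voltages) - value)
```

A result of 0 means the median voltage sits exactly at the target, so smaller values are better when this is used as a search objective.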
minWithStd(voltages, value=0.1)
Computes the normalized difference between the minimum voltage and a given value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
voltages | ndarray | Array of voltage values. | required |
value | float | Value to compare the minimum to. Defaults to 0.1. | 0.1 |
Returns:
Type | Description |
---|---|
float | Normalized absolute difference using standard deviation. |
Source code in code\bpf.py
minimum(voltages, value=0.1)
Computes the absolute difference between the minimum voltage and a given value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
voltages | ndarray | Array of voltage values. | required |
value | float | Value to compare the minimum to. Defaults to 0.1. | 0.1 |
Returns:
Type | Description |
---|---|
float | Absolute difference from the minimum. |
Source code in code\bpf.py
nInfExp(voltages, base=10)
Computes the infinity-norm distance between voltages and an exponential distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
voltages | ndarray | Array of voltage values. | required |
base | float | Base of the exponential function. Defaults to 10. | 10 |
Returns:
Type | Description |
---|---|
float | Infinity-norm distance. |
Source code in code\bpf.py
nInfUniform(voltages)
Computes the infinity-norm distance between voltages and a uniform distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
voltages | ndarray | Array of voltage values. | required |
Returns:
Type | Description |
---|---|
float | Infinity-norm distance. |
Source code in code\bpf.py
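One plausible reading of "infinity-norm distance to a uniform distribution" is sketched below: sort the voltages, compare them with evenly spaced values on [0, 1], and take the largest absolute gap. The reference grid, the [0, 1] range, and the name `n_inf_uniform` are assumptions; the actual normalization in bpf.py is not shown in this reference.

```python
def n_inf_uniform(voltages):
    """Largest gap between sorted voltages and an evenly spaced grid on [0, 1]."""
    ordered = sorted(voltages)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two voltages to build a reference grid")
    # The i-th sorted voltage is compared with i / (n - 1).
    return max(abs(v - i / (n - 1)) for i, v in enumerate(ordered))
```

Under this reading, perfectly evenly spread voltages give 0 and the value grows as voltages bunch together.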
setKernelParameter(c)
Sets the kernel parameter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
c | float | Kernel parameter (logarithmic scale will be used). | required |
Source code in code\bpf.py
setResistanceToGround(p_g)
Sets the resistance to ground parameter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
p_g | float | Resistance to ground value (logarithmic scale will be used). | required |
Source code in code\bpf.py
visualizations(voltages, fileStarter)
Generates and saves PCA and MDS visualizations of the voltage data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
voltages | List[ndarray] | List of voltage arrays. | required |
fileStarter | str | File name prefix for saving plots. | required |
Returns:
Type | Description |
---|---|
None | None |
Source code in code\bpf.py
Data
Class for handling and processing data sets.
Source code in code\create_data.py
__getitem__(index)
Allows indexing into the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | int | Index of the desired data point. | required |
Returns:
Type | Description |
---|---|
np.ndarray | The data point at the given index. |
Source code in code\create_data.py
__init__(arg=None, stream=False)
Initializes the Data object from a list, file path, or raw data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
arg | Union[list, str, Any] | The input data or path to data file. | None |
stream | bool | Whether to use streaming mode for large files. | False |
Source code in code\create_data.py
__iter__()
Returns an iterator over the dataset for use in for-loops.
Returns:
Type | Description |
---|---|
Iterator | An iterator over the dataset. |
Source code in code\create_data.py
__len__()
Returns the length of the dataset.
Returns:
Type | Description |
---|---|
int | The number of data points. |
Source code in code\create_data.py
__next__()
Retrieves the next data point in an iteration.
Returns:
Type | Description |
---|---|
np.ndarray | The next data point. |
Raises:
Type | Description |
---|---|
StopIteration | If the end of the dataset is reached. |
Source code in code\create_data.py
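The container methods documented here (`__len__`, `__getitem__`, `__setitem__`, `__iter__`, `__next__`) follow Python's standard sequence and iterator protocols. The sketch below shows how they fit together on a small in-memory stand-in; `PointList` is a hypothetical class, not the Data implementation, and it ignores streaming mode entirely.

```python
class PointList:
    """Hypothetical in-memory stand-in for a Data-like container."""

    def __init__(self, points):
        self._points = list(points)
        self._cursor = 0

    def __len__(self):
        return len(self._points)

    def __getitem__(self, index):
        return self._points[index]

    def __setitem__(self, index, value):
        self._points[index] = value

    def __iter__(self):
        # The object is its own iterator, so each new loop restarts the cursor.
        self._cursor = 0
        return self

    def __next__(self):
        if self._cursor >= len(self._points):
            raise StopIteration
        point = self._points[self._cursor]
        self._cursor += 1
        return point
```

Because the object acts as its own iterator, two nested loops over the same instance would share one cursor. Returning a separate iterator object from `__iter__` avoids that, at the cost of a second class.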
__setitem__(index, value)
Sets a value in the dataset at a specified index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | int | The index to modify. | required |
value | Any | The new value to set. | required |
Source code in code\create_data.py
data_function(file, save_or_load)
Routes file operation to appropriate function based on file extension.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file | str | File path. | required |
save_or_load | int | 1 for save, 2 for load. | required |
Returns:
Type | Description |
---|---|
Optional[Any] | The result of the load operation if applicable. |
Source code in code\create_data.py
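Routing by file extension can be sketched with a small dispatch table. The example below is illustrative only: the function name `route_by_extension`, the explicit `payload` argument, and the choice of `.json` and `.pkl` as the recognized extensions are assumptions. Only the meaning of `save_or_load` (1 for save, 2 for load) comes from the table above.

```python
import json
import os
import pickle


def _save_json(path, payload):
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)


def _load_json(path):
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def _save_pickle(path, payload):
    with open(path, "wb") as fh:
        pickle.dump(payload, fh)


def _load_pickle(path):
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Assumed extension-to-handler mapping: (save function, load function).
HANDLERS = {".json": (_save_json, _load_json), ".pkl": (_save_pickle, _load_pickle)}


def route_by_extension(path, save_or_load, payload=None):
    """Dispatch a save (1) or load (2) to the handler for the file extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in HANDLERS:
        raise ValueError(f"unsupported extension: {ext!r}")
    save, load = HANDLERS[ext]
    if save_or_load == 1:
        save(path, payload)
        return None
    if save_or_load == 2:
        return load(path)
    raise ValueError("save_or_load must be 1 (save) or 2 (load)")
```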
getNumpy()
Ensures that the dataset is returned as a NumPy array.
Returns:
Type | Description |
---|---|
np.ndarray | Dataset as a NumPy array. |
Source code in code\create_data.py
getSubSet(indexList)
Returns a subset of the data given a list of indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
indexList | list[int] | List of indices to extract. | required |
Returns:
Type | Description |
---|---|
Data | A new Data object containing the selected subset. |
Source code in code\create_data.py
get_random_point()
Returns a randomly selected point from the dataset.
Returns:
Type | Description |
---|---|
np.ndarray | A random data point. |
Source code in code\create_data.py
load_data(input_file)
Loads the dataset from a file, choosing format by extension.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_file | str | Path to the input file. | required |
Returns:
Type | Description |
---|---|
Data | Self (for chaining). |
Source code in code\create_data.py
load_data_json(input_file)
Loads the dataset from a JSON file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_file | str | Path to the input file. | required |
Returns:
Type | Description |
---|---|
list[np.ndarray] | The loaded data. |
Source code in code\create_data.py
load_data_pickle(input_file)
Loads the dataset from a pickle file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_file | str | Path to the input file. | required |
Returns:
Type | Description |
---|---|
Any | The loaded data. |
Source code in code\create_data.py
plot(name=None)
Plots the dataset using matplotlib.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | Optional[str] | File path to save the plot, if specified. | None |
Source code in code\create_data.py
save_data(output_file)
Saves the dataset to a file, choosing format by extension.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | Path to the output file. | required |
Returns:
Type | Description |
---|---|
Data | Self (for chaining). |
Source code in code\create_data.py
save_data_json(output_file)
Saves the dataset to a JSON file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | Path to the output file. | required |
Source code in code\create_data.py
save_data_pickle(output_file)
Saves the dataset to a pickle file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | Path to the output file. | required |
Source code in code\create_data.py
stream_data_json(input_file)
Streams data from a JSON file one entry at a time.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_file | str | Path to the input JSON file. | required |
Yields:
Type | Description |
---|---|
np.ndarray | A single data point from the dataset. |
Source code in code\create_data.py
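Reading one entry at a time keeps memory use flat for large files. The on-disk layout used by create_data.py is not described in this reference, so the sketch below assumes a JSON Lines layout (one JSON array per line). The name `stream_points` is hypothetical, and the sketch yields plain lists rather than np.ndarray objects.

```python
import json


def stream_points(path):
    """Yield one point per non-empty line of a JSON Lines file (assumed layout)."""
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)
```

A single large JSON array cannot be read this way with the standard `json` module alone, which is one reason a line-oriented layout is a common choice for streaming.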
DataCreator
A utility class to create various synthetic datasets for testing and analysis. Interfaces with FileGenerator to optionally stream data to file.
Attributes:
Name | Type | Description |
---|---|---|
fg | FileGenerator | An instance of FileGenerator used for generating data points. |
Source code in code\create_data.py
create_dataset_eigth_sphere(output_file=None, radius=1, x_pos=True, y_pos=True, z_pos=True, points=1000, seed=42, stream=False)
Creates a dataset on an eighth of a sphere.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | None |
radius | float | Radius of the sphere. | 1 |
x_pos | bool | Use positive x. | True |
y_pos | bool | Use positive y. | True |
z_pos | bool | Use positive z. | True |
points | int | Number of data points. | 1000 |
seed | int | Random seed. | 42 |
stream | bool | Whether to stream to file. | False |
Returns:
Type | Description |
---|---|
Data | The generated dataset. |
Source code in code\create_data.py
create_dataset_line(output_file=None, start=0, end=1, points=1000, seed=42, stream=False)
Creates a 1D line dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | None |
start | float | Starting point of the line. | 0 |
end | float | Ending point of the line. | 1 |
points | int | Number of data points. | 1000 |
seed | int | Random seed. | 42 |
stream | bool | Whether to stream to file. | False |
Returns:
Type | Description |
---|---|
Data | The generated dataset. |
Source code in code\create_data.py
create_dataset_spiral(output_file=None, radius=1, center=[0, 0], rotations=3, height=10, points=1000, seed=42, stream=False)
Creates a 3D spiral dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | None |
radius | float | Radius of the spiral. | 1 |
center | list | Center offset. | [0, 0] |
rotations | int | Number of rotations. | 3 |
height | float | Height of the spiral. | 10 |
points | int | Number of data points. | 1000 |
seed | int | Random seed. | 42 |
stream | bool | Whether to stream to file. | False |
Returns:
Type | Description |
---|---|
Data | The generated dataset. |
Source code in code\create_data.py
create_dataset_square_edge(output_file=None, p1=(0, 0), p2=(1, 1), points=1000, seed=42)
Creates a dataset of points along the edges of a square.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | None |
p1 | tuple | Bottom-left corner. | (0, 0) |
p2 | tuple | Top-right corner. | (1, 1) |
points | int | Number of data points. | 1000 |
seed | int | Random seed. | 42 |
Returns:
Type | Description |
---|---|
Data | The generated dataset. |
Source code in code\create_data.py
create_dataset_square_fill(output_file=None, p1=(0, 0), p2=(1, 1), points=1000, seed=42)
Creates a dataset of points filling a square area.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | None |
p1 | tuple | Bottom-left corner. | (0, 0) |
p2 | tuple | Top-right corner. | (1, 1) |
points | int | Number of data points. | 1000 |
seed | int | Random seed. | 42 |
Returns:
Type | Description |
---|---|
Data | The generated dataset. |
Source code in code\create_data.py
create_dataset_strong_clusters(output_file=None, internal_std=1, external_std=10, mean=[0, 0], clusters=10, points=1000, seed=42, stream=False)
Creates a clustered dataset with multiple clusters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | None |
internal_std | float | Standard deviation inside a cluster. | 1 |
external_std | float | Spread of cluster centers. | 10 |
mean | list | Mean location for generating cluster centers. | [0, 0] |
clusters | int | Number of clusters. | 10 |
points | int | Number of data points. | 1000 |
seed | int | Random seed. | 42 |
stream | bool | Whether to stream to file. | False |
Returns:
Type | Description |
---|---|
Data | The generated dataset. |
Source code in code\create_data.py
create_dataset_triangle(output_file=None, edges=[[0, 0], [1, 1], [2, 0]], points=1000, seed=42, stream=False)
Creates a dataset of points on a triangle.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | None |
edges | list | Three vertices of the triangle. | [[0, 0], [1, 1], [2, 0]] |
points | int | Number of data points. | 1000 |
seed | int | Random seed. | 42 |
stream | bool | Whether to stream to file. | False |
Returns:
Type | Description |
---|---|
Data | The generated dataset. |
Source code in code\create_data.py
rotate_into_dimention(data, higher_dim=3, seed=42)
Rotates dataset into a higher dimensional space using random rotations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Data | The dataset to rotate. | required |
higher_dim | int | Dimension to rotate into. | 3 |
seed | int | Random seed. | 42 |
Returns:
Type | Description |
---|---|
Data | The rotated dataset. |
Source code in code\create_data.py
stream_dataset_creator(output_file, function, seed, stream, *args)
Creates a dataset using the specified generator function, supporting streamed or non-streamed output.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | File path to save the dataset. | required |
function | callable | Generator function to create data points. | required |
seed | int | Random seed for reproducibility. | required |
stream | bool | If True, streams data directly to the file. | required |
*args | | Additional arguments passed to the generator function. | () |
Returns:
Type | Description |
---|---|
Data | The created dataset, either streamed or in-memory. |
Source code in code\create_data.py
FileGenerator
Generates files for saved data.
This class is designed to assist in saving generated datasets in a streaming
fashion. It provides several built-in generators to create synthetic datasets
for use with the Data and DataCreator classes.
Source code in code\create_data.py
__init__()
Initializes the FileGenerator.
Source code in code\create_data.py
eigth_sphere_generator(radius, x_pos, y_pos, z_pos, points)
Generates points on an eighth of a sphere surface.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
radius | float | Radius of the sphere. | required |
x_pos | int | Hemisphere direction for X (0 or 1). | required |
y_pos | int | Hemisphere direction for Y (0 or 1). | required |
z_pos | int | Hemisphere direction for Z (0 or 1). | required |
points | int | Number of points to generate. | required |
Yields:
Type | Description |
---|---|
np.ndarray | Points on the eighth sphere surface. |
Source code in code\create_data.py
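A common way to sample one octant of a sphere is to normalize a Gaussian vector and then force the sign of each coordinate. The sketch below follows that approach; it may differ from the sampling scheme in create_data.py. The name `eighth_sphere_points`, the `seed` argument, and the use of plain lists instead of np.ndarray are assumptions.

```python
import math
import random


def eighth_sphere_points(radius, x_pos, y_pos, z_pos, points, seed=42):
    """Yield points on one octant of a sphere of the given radius."""
    rng = random.Random(seed)
    # A truthy flag selects the positive half-axis, a falsy one the negative.
    signs = [1.0 if flag else -1.0 for flag in (x_pos, y_pos, z_pos)]
    for _ in range(points):
        norm = 0.0
        while norm == 0.0:
            # A normalized Gaussian vector is uniformly distributed on the sphere.
            vec = [rng.gauss(0.0, 1.0) for _ in range(3)]
            norm = math.sqrt(sum(c * c for c in vec))
        yield [s * radius * abs(c) / norm for s, c in zip(signs, vec)]
```

Taking absolute values folds the whole sphere onto a single octant, so the points stay uniform on that octant.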
line_generator(start, end, points)
Generates points along a line in 1D space.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start | float | Starting point of the line. | required |
end | float | Ending point of the line. | required |
points | int | Number of points to generate. | required |
Yields:
Type | Description |
---|---|
np.ndarray | Single-point arrays sampled along the line. |
Source code in code\create_data.py
linear_generator(data)
Yields data points one by one from a NumPy array.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Input data. | required |
Yields:
Type | Description |
---|---|
np.ndarray | Single data points from the array. |
Source code in code\create_data.py
setGenerator(fn)
Sets the generator function to be used when saving data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | A generator function that yields data points. | required |
Source code in code\create_data.py
spiral_generator(radius, center, rotations, height, points)
Generates points forming a 3D spiral (helix).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
radius | float | Radius of the spiral. | required |
center | list | Center offset of the spiral (not used directly in current implementation). | required |
rotations | int | Number of full 360° turns. | required |
height | float | Total height of the spiral. | required |
points | int | Number of points to generate. | required |
Yields:
Type | Description |
---|---|
np.ndarray | Points along the spiral. |
Source code in code\create_data.py
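The helix geometry implied by these parameters can be written out directly. In this sketch the points are evenly spaced along the curve, which is an assumption (the real generator may sample positions randomly), and `center` is left out because the table notes it is not used directly. `helix_points` is a hypothetical name.

```python
import math


def helix_points(radius, rotations, height, points):
    """Yield evenly spaced points along a helix rising from z=0 to z=height."""
    for i in range(points):
        # t runs from 0 to 1 along the curve.
        t = i / (points - 1) if points > 1 else 0.0
        angle = 2.0 * math.pi * rotations * t
        yield [radius * math.cos(angle), radius * math.sin(angle), height * t]
```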
stream_save(output_file, *args)
Saves data to a JSON file in a streaming manner.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | Path to the file where data will be saved. | required |
*args | | Arguments to pass to the generator function. | () |
Returns:
Type | Description |
---|---|
None |
Source code in code\create_data.py
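Streaming a generator to disk means writing each point as soon as it is produced instead of building the full list first. The sketch below assumes a JSON Lines layout (one JSON array per line) and takes the generator function as an explicit argument, whereas the documented class receives it through setGenerator. The name `stream_save_sketch` and the file layout are assumptions.

```python
import json


def stream_save_sketch(output_file, generator_fn, *args):
    """Write each generated point to its own line as a JSON array."""
    with open(output_file, "w", encoding="utf-8") as fh:
        for point in generator_fn(*args):
            # list() lets the sketch accept tuples or other simple iterables.
            fh.write(json.dumps(list(point)) + "\n")
```

Only one point is held in memory at a time, so the dataset size is bounded by disk space rather than RAM.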
strong_cluster_generator(internal_std, cluster_centers, points)
Generates clustered points around multiple centers with specified standard deviation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
internal_std | float | Standard deviation within each cluster. | required |
cluster_centers | list | A list of cluster center points. | required |
points | int | Number of points to generate. | required |
Yields:
Type | Description |
---|---|
np.ndarray | Points sampled from the clusters. |
Source code in code\create_data.py
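Clustered sampling of this kind usually picks a center and then adds Gaussian noise to it. The sketch below assumes centers are chosen uniformly at random and that noise is isotropic; neither detail is confirmed by this reference. `cluster_points` and its `seed` argument are hypothetical.

```python
import random


def cluster_points(internal_std, cluster_centers, points, seed=42):
    """Yield points scattered around randomly chosen cluster centers."""
    rng = random.Random(seed)
    for _ in range(points):
        center = rng.choice(cluster_centers)
        # Independent Gaussian noise on each coordinate of the chosen center.
        yield [rng.gauss(coord, internal_std) for coord in center]
```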
triangle_generator(edges, points)
Generates points uniformly within a triangle defined by three vertices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
edges | list | A list of three points (each a list or np.ndarray) defining the triangle. | required |
points | int | Number of points to generate. | required |
Yields:
Type | Description |
---|---|
np.ndarray | Points uniformly sampled inside the triangle. |
Source code in code\create_data.py
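Uniform sampling inside a triangle is often done with barycentric weights built from two uniform random numbers, using a square root to avoid crowding points near one vertex. The sketch below uses that standard technique; whether create_data.py uses the same one is not stated here. `triangle_points` and its `seed` argument are hypothetical.

```python
import math
import random


def triangle_points(vertices, points, seed=42):
    """Yield points uniformly distributed inside the triangle (a, b, c)."""
    rng = random.Random(seed)
    a, b, c = vertices
    for _ in range(points):
        r1, r2 = rng.random(), rng.random()
        s = math.sqrt(r1)
        # Barycentric weights; they are non-negative and sum to 1.
        wa, wb, wc = 1.0 - s, s * (1.0 - r2), s * r2
        yield [wa * pa + wb * pb + wc * pc for pa, pb, pc in zip(a, b, c)]
```

Without the square root, points would cluster toward vertex `a`, because the area of the sub-triangle grows quadratically with the distance from it.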
Plotter
Graphs the data into different formats.
Source code in code\create_data.py
plotPointSets(sets, name=None)
Plots multiple sets of points in different colors.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sets | list[list[ndarray]] | A list of point sets. | required |
name | Optional[str] | Optional filename to save the plot. | None |
Source code in code\create_data.py
plotPoints(points, name=None)
Plots a single set of points in 2D or 3D.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
points | list[ndarray] | A list of points to plot. | required |
name | Optional[str] | Optional filename to save the plot. | None |
Source code in code\create_data.py
pointFormatting(points)
Formats points into separate coordinate lists for plotting.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
points | list[ndarray] | A list of points as NumPy arrays. | required |
Returns:
Type | Description |
---|---|
tuple[list[float], list[float], Optional[list[float]]] | x, y, and optionally z coordinate lists. |
Source code in code\create_data.py
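Splitting points into per-axis lists is the shape most plotting calls expect. This sketch shows the idea for 2D and 3D input, returning `None` for z when the points have only two coordinates, as the return type above suggests. `split_coordinates` is a hypothetical name, and handling of other dimensions in the real method is not covered.

```python
def split_coordinates(points):
    """Return (xs, ys, zs) coordinate lists; zs is None for 2D points."""
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    has_z = bool(points) and len(points[0]) >= 3
    zs = [float(p[2]) for p in points] if has_z else None
    return xs, ys, zs
```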
voltage_plot(solver, color='r', ax=None, show=True, label='', colored=False, name=None)
Plots voltage data overlaid on input data using optional PCA projection.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
solver | | A voltage solver instance with | required |
color | str | Color for the points if | 'r' |
ax | | Matplotlib axis to plot on (if provided). | None |
show | bool | Whether to show the plot. | True |
label | str | Label for the legend. | '' |
colored | bool | Whether to color the points by voltage values. | False |
name | Optional[str] | Optional filename to save the plot. | None |
Returns:
Type | Description |
---|---|
The axis with the plotted data. |
Source code in code\create_data.py
dimentional_variation(dimentions)
Returns a NumPy array of random values from a standard normal distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dimentions | int | Number of dimensions/values to return. | required |
Returns:
Type | Description |
---|---|
np.ndarray | Array of random values sampled from the standard normal distribution. |
Source code in code\create_data.py
select_random(array)
Selects a random element from an array.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
array | list | The array to select from. | required |
Returns:
Type | Description |
---|---|
Any | A random element from the array. |
Source code in code\create_data.py
varied_point(mean, std)
Returns a point that is randomly offset from the mean based on standard deviation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mean
|
ndarray
|
The mean location of the point. |
required |
std
|
float
|
Standard deviation to apply. |
required |
Returns:
Type | Description |
---|---|
ndarray
|
np.ndarray: A randomly varied point. |
Source code in code\create_data.py
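The following is a minimal sketch of how a point could be randomly offset from a mean, assuming the offset is one standard-normal draw per dimension scaled by std. The name `varied_point_sketch` and the `rng` parameter are hypothetical and not part of create_data.py.

```python
import numpy as np


def varied_point_sketch(mean, std, rng=None):
    """Return mean plus isotropic Gaussian noise scaled by std (assumed behavior)."""
    rng = np.random.default_rng() if rng is None else rng
    mean = np.asarray(mean, dtype=float)
    # One standard-normal draw per dimension, scaled by the standard deviation.
    return mean + std * rng.standard_normal(mean.shape)


point = varied_point_sketch(np.array([1.0, 2.0, 3.0]), std=0.5,
                            rng=np.random.default_rng(0))
```

Passing an explicit generator keeps the sketch reproducible; with std set to 0 the result equals the mean.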
Partitions
Bases: DistanceBased
Uses k-means to partition a large dataset.
Source code in code\kmeans.py
getClosestPoints(index)
Finds the points whose closest point is the point indicated by the index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | int | The index of the point. | required |
Returns:
Type | Description |
---|---|
List[np.ndarray] | All the points whose closest point is data[index]. |
Source code in code\kmeans.py
k_means(k, seed=42, savePointAssignments=False)
Runs k-means and saves the centers and point counts, with the option to save pointAssignments for Voronoi drawing.
Source code in code\kmeans.py
k_means_plus_plus(k)
The old k-means++ implementation, used before switching to scikit-learn.
Source code in code\kmeans.py
my_k_means(k, seed=42, savePointAssignments=False)
The old k-means algorithm
Source code in code\kmeans.py
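A hand-written k-means usually follows Lloyd's algorithm: assign each point to its nearest center, move each center to the mean of its members, and repeat until the centers stop moving. The sketch below illustrates that loop and the per-center point counts mentioned above. It is not the code in kmeans.py; the function name, the `max_iter` parameter, and the random initialization are assumptions.

```python
import numpy as np


def lloyd_kmeans_sketch(points, k, seed=42, max_iter=100):
    """Plain Lloyd's algorithm; returns (centers, counts, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]

    def assign(current_centers):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((points[:, None, :] - current_centers[None, :, :]) ** 2).sum(axis=-1)
        return d2.argmin(axis=1)

    for _ in range(max_iter):
        labels = assign(centers)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    labels = assign(centers)
    counts = np.bincount(labels, minlength=k)  # number of points per center
    return centers, counts, labels


points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
centers, counts, labels = lloyd_kmeans_sketch(points, k=2)
```

The returned labels play the role of the optional point assignments, and the counts record how many points each center represents.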
plot(color='r', marker='o', ax=None, name=None)
Plots the k-means partition.
Source code in code\kmeans.py
Landmark
Represents a location in the dataset where a voltage will be applied.
The index can refer either to an individual datapoint or a partition center.
Source code in code\voltage.py
__init__(index, voltage)
Initializes a Landmark.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | int | Index of the datapoint or partition center. | required |
voltage | float | Voltage to be applied at the specified index. | required |
Source code in code\voltage.py
createLandmarkClosestTo(data, point, voltage, distanceFn=None, ignore=[])
staticmethod
Creates a Landmark at the index of the datapoint in data closest to point.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | List[Any] | The dataset to search over. | required |
point | Any | The reference point to find the closest datapoint to. | required |
voltage | float | The voltage to assign to the resulting Landmark. | required |
distanceFn | Optional[object] | A distance function with a | None |
ignore | List[int] | List of indices to skip during the search. Defaults to empty list. | [] |
Returns:
Name | Type | Description |
---|---|---|
Landmark | Landmark | A Landmark instance corresponding to the closest datapoint. |
Source code in code\voltage.py
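The nearest-datapoint search described above can be sketched as follows. This is an illustration only: `LandmarkSketch` and `closest_landmark_sketch` are hypothetical stand-ins, plain Euclidean distance is assumed, and the distanceFn hook is left out because its interface is not shown here.

```python
import math
from dataclasses import dataclass


@dataclass
class LandmarkSketch:
    """Hypothetical stand-in for Landmark: an index plus a voltage."""
    index: int
    voltage: float


def closest_landmark_sketch(data, point, voltage, ignore=None):
    """Build a landmark at the datapoint nearest to point, skipping ignored indices."""
    skipped = set(ignore or ())
    candidates = [i for i in range(len(data)) if i not in skipped]
    if not candidates:
        raise ValueError("no candidate datapoints left after applying ignore")
    # Assumed metric: Euclidean distance between coordinate sequences.
    best = min(candidates, key=lambda i: math.dist(data[i], point))
    return LandmarkSketch(index=best, voltage=voltage)


data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
lm = closest_landmark_sketch(data, (0.9, 1.2), voltage=1.0)
lm_skip = closest_landmark_sketch(data, (0.9, 1.2), voltage=1.0, ignore=[1])
```

The sketch defaults ignore to None rather than to a list literal, which sidesteps Python's shared mutable default argument behavior.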
Problem
Bases: DistanceBased
Represents the clustering/graph problem to be solved, extending a distance-based kernel with landmarks and weights.
Source code in code\voltage.py
__init__(data)
Initializes the Problem instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Any | An object containing your dataset. Must support len(data) and data.getNumpy(), which returns an (n, d) NumPy array. | required |
Source code in code\voltage.py
addLandmark(landmark)
Adds a single Landmark to the problem.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
landmark | Landmark | The landmark instance to append. | required |
Source code in code\voltage.py
addLandmarks(landmarks)
Adds multiple Landmark instances to the problem.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
landmarks | List[Landmark] | List of landmarks to append. | required |
Source code in code\voltage.py
addLandmarksInRange(minRange, maxRange, voltage)
Adds landmarks for all data points within a given coordinate range.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
minRange | array-like | Minimum bounds per dimension. | required |
maxRange | array-like | Maximum bounds per dimension. | required |
voltage | float | Voltage to apply at each new landmark. | required |
Returns:
Type | Description |
---|---|
List[Landmark] | The list of newly added landmarks. |
Source code in code\voltage.py
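Selecting every datapoint inside a per-dimension coordinate range reduces to a boolean mask. The sketch below assumes inclusive bounds and uses a hypothetical helper name, `indices_in_range`; the landmarks are represented as plain (index, voltage) pairs rather than Landmark objects.

```python
import numpy as np


def indices_in_range(points, min_range, max_range):
    """Indices of rows lying within [min_range, max_range] in every dimension."""
    points = np.asarray(points, dtype=float)
    lo = np.asarray(min_range, dtype=float)
    hi = np.asarray(max_range, dtype=float)
    # A row qualifies only if all of its coordinates fall inside the bounds.
    inside = np.all((points >= lo) & (points <= hi), axis=1)
    return np.flatnonzero(inside)


points = np.array([[0.0, 0.0], [0.5, 0.5], [2.0, 0.5], [0.9, 0.1]])
idx = indices_in_range(points, [0.25, 0.0], [1.0, 1.0])
# One (index, voltage) pair per selected datapoint.
landmarks = [(int(i), 1.0) for i in idx]
```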
addUniversalGround(p_g=0.01)
Adds (or updates) a 'universal ground' node connected uniformly to all others.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
p_g | float | Total ground connection probability to distribute. | 0.01 |
Returns:
Name | Type | Description |
---|---|---|
ndarray | ndarray | The updated normalized weight matrix including the ground node. |
Source code in code\voltage.py
efficientSquareDistance(data)
Computes the pairwise squared Euclidean distances of the rows in data.
Uses the identity ‖x−y‖² = ‖x‖² + ‖y‖² − 2 x·y for efficiency.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Array of shape (n, d). | required |
Returns:
Name | Type | Description |
---|---|---|
ndarray | ndarray | Matrix of shape (n, n) where entry (i, j) is squared distance. |
Source code in code\voltage.py
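The identity above lets all pairwise squared distances be computed from one matrix product instead of an explicit (n, n, d) difference array. A minimal NumPy sketch follows; the function name and the clipping step are assumptions, not taken from voltage.py.

```python
import numpy as np


def square_distance_sketch(data):
    """Pairwise squared Euclidean distances between the rows of data."""
    data = np.asarray(data, dtype=float)
    sq_norms = np.sum(data * data, axis=1)  # ‖x‖² for every row
    # ‖x−y‖² = ‖x‖² + ‖y‖² − 2 x·y, evaluated for all pairs at once.
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (data @ data.T)
    # Floating-point cancellation can leave tiny negative values; clip them.
    return np.maximum(d2, 0.0)


data = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
d2 = square_distance_sketch(data)
```

The matrix product dominates the cost, so memory stays at O(n² + nd) rather than O(n²d).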
gaussiankernel(data, std)
Builds a Gaussian (RBF) kernel matrix.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Array of shape (n, d). | required |
std | float | Standard deviation parameter for the Gaussian. | required |
Returns:
Name | Type | Description |
---|---|---|
ndarray | ndarray | Kernel matrix of shape (n, n). |
Source code in code\voltage.py
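As an illustration, a Gaussian (RBF) kernel matrix can be built from pairwise squared distances. The sketch assumes the common convention exp(−‖x−y‖² / (2·std²)); the scaling actually used in voltage.py may differ, and the function name is hypothetical.

```python
import numpy as np


def gaussian_kernel_sketch(data, std):
    """Gaussian kernel matrix of shape (n, n) for the rows of data."""
    data = np.asarray(data, dtype=float)
    diff = data[:, None, :] - data[None, :, :]
    d2 = np.sum(diff * diff, axis=-1)  # pairwise squared distances
    # Assumed convention: exp(-d² / (2 std²)).
    return np.exp(-d2 / (2.0 * std ** 2))


K = gaussian_kernel_sketch(np.array([[0.0], [1.0], [3.0]]), std=1.0)
```

Under this convention the diagonal is 1 and the entries decay smoothly with distance, so nearby points receive larger weights.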
normalizeWeights()
Normalizes each row of the weight matrix to sum to 1.
Raises:
Type | Description |
---|---|
ValueError | If any row sums to zero, resulting in NaNs. |
Source code in code\voltage.py
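Row normalization divides each row by its sum so that every row becomes a probability distribution. The sketch below uses a hypothetical standalone function and checks for zero rows before dividing, which is one way to produce the ValueError described above; the method itself may detect the problem differently.

```python
import numpy as np


def normalize_rows_sketch(weights):
    """Scale each row of weights so that it sums to 1."""
    weights = np.asarray(weights, dtype=float)
    row_sums = weights.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        # Dividing by zero would fill the row with NaNs, so fail early.
        raise ValueError("a row of the weight matrix sums to zero")
    return weights / row_sums


W = normalize_rows_sketch([[1.0, 1.0], [1.0, 3.0]])
```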
radialkernel(data, r)
Builds a binary (0/1) radial kernel: 1 if distance ≤ r, else 0.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Array of shape (n, d). | required |
r | float | Radius threshold. | required |
Returns:
Name | Type | Description |
---|---|---|
ndarray | ndarray | Adjacency-like matrix (n×n) of 0/1 floats. |
Source code in code\voltage.py
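The binary radial kernel is a thresholded distance matrix. A minimal sketch under the stated rule (1 if distance ≤ r, else 0), with a hypothetical function name:

```python
import numpy as np


def radial_kernel_sketch(data, r):
    """0/1 float matrix marking pairs of rows within distance r of each other."""
    data = np.asarray(data, dtype=float)
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt(np.sum(diff * diff, axis=-1))
    # Boolean comparison cast to floats, giving an adjacency-like matrix.
    return (dist <= r).astype(float)


A = radial_kernel_sketch(np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]), r=1.5)
```

Every point is within distance 0 of itself, so the diagonal is always 1 and no row sums to zero.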
setKernel(kernel)
Sets the kernel function to use for weight computations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kernel | callable | A function or callable object with signature kernel(X, Y, *params) → ndarray of shape (len(X), len(Y)). | required |
Source code in code\voltage.py
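To make the expected signature concrete, here is a hypothetical kernel that takes two point sets plus one extra parameter and returns one entry per (x, y) pair. The function `laplacian_kernel` and its scale parameter are illustrative only and do not appear in voltage.py.

```python
import numpy as np


def laplacian_kernel(X, Y, scale=1.0):
    """Hypothetical kernel: exp(-‖x − y‖ / scale) for every pair (x, y)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    diff = X[:, None, :] - Y[None, :, :]
    dist = np.sqrt(np.sum(diff * diff, axis=-1))
    return np.exp(-dist / scale)  # one row per x, one column per y


X = np.zeros((4, 2))
Y = np.ones((3, 2))
K = laplacian_kernel(X, Y, 2.0)
```

A callable of this shape matches the documented signature, with the extra parameter (here 2.0) being the kind of value that setWeights(*c) forwards to the kernel.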
setPartitionWeights(partition, *c)
Computes and normalizes weights based on cluster centers and sizes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
partition | Any | An object with attributes | required |
*c | Any | Parameters to pass into the kernel function. | () |
Returns:
Name | Type | Description |
---|---|---|
ndarray | ndarray | The normalized weight matrix for the partition block. |
Source code in code\voltage.py
setWeights(*c)
Computes and normalizes the weight matrix on the original data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*c | Any | Parameters to pass into the currently set kernel function. | () |
Returns:
Name | Type | Description |
---|---|---|
ndarray | ndarray | The normalized weight matrix (n×n). |
Source code in code\voltage.py
timeEnd(replace=True)
Computes the elapsed time since the last timeStart().
Parameters:
Name | Type | Description | Default |
---|---|---|---|
replace | bool | If True, resets the start time to now. | True |
Returns:
Name | Type | Description |
---|---|---|
float | float | Seconds elapsed since last start. |
Source code in code\voltage.py
timeStart()
Records the current time to measure elapsed intervals.
Source code in code\voltage.py
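The timeStart()/timeEnd() pair behaves like a lap timer. The sketch below shows one way such a pair could work, assuming a monotonic clock; `TimerSketch`, its snake_case method names, and the RuntimeError are hypothetical and not part of voltage.py.

```python
import time


class TimerSketch:
    """Hypothetical lap timer mirroring the timeStart()/timeEnd() pair."""

    def __init__(self):
        self._start = None

    def time_start(self):
        # Record the current time as the reference point.
        self._start = time.perf_counter()

    def time_end(self, replace=True):
        if self._start is None:
            raise RuntimeError("time_start() has not been called")
        now = time.perf_counter()
        elapsed = now - self._start
        if replace:
            # Reset the reference so the next call measures a fresh interval.
            self._start = now
        return elapsed


timer = TimerSketch()
timer.time_start()
first = timer.time_end()                # also resets the start time
second = timer.time_end(replace=False)  # leaves the start time untouched
```

With replace=True, successive calls measure consecutive intervals; with replace=False, they measure cumulative time from the same start.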
Solver
Bases: DistanceBased
Solves a given Problem
Source code in code\voltage.py