To pseudonymize the "name" values in the given dataset using the specified hash function, let's follow these steps:

1. **Convert each character in the name to its alphabetic position using the `alph` function**, where:
   - `A = 1`, `B = 2`, `C = 3`, and so on, up to `Z = 26`.

2. **Compute the hash value** using the formula:
   $$
   h(s) = \left( \sum_{i} \text{alph}(s_i) \right) \bmod 13
   $$
   where \( s_i \) is the \( i \)-th character of the string.

3. **Add a salt** to the hash value. The salt is a random value combined with the hash so that identical inputs do not produce identical pseudonyms (see the sketch below).

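As a minimal Python sketch of the first two steps (the names `alph` and `h` simply mirror the exercise's notation):

```python
import string

def alph(c: str) -> int:
    """Alphabetic position of a letter: A = 1, B = 2, ..., Z = 26."""
    return string.ascii_uppercase.index(c.upper()) + 1

def h(s: str) -> int:
    """Hash: sum of alphabetic positions, reduced mod 13."""
    return sum(alph(c) for c in s) % 13
```
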
Let's apply these steps to each name in the dataset.

### Step-by-Step Calculation

For each name, we'll calculate the hash value before adding a salt.

#### 1. Name: Franz
- `F = 6`, `R = 18`, `A = 1`, `N = 14`, `Z = 26`
- Sum of positions: \( 6 + 18 + 1 + 14 + 26 = 65 \)
- Hash: \( 65 \bmod 13 = 0 \)

#### 2. Name: Antje
- `A = 1`, `N = 14`, `T = 20`, `J = 10`, `E = 5`
- Sum of positions: \( 1 + 14 + 20 + 10 + 5 = 50 \)
- Hash: \( 50 \bmod 13 = 11 \)

#### 3. Name: Alex
- `A = 1`, `L = 12`, `E = 5`, `X = 24`
- Sum of positions: \( 1 + 12 + 5 + 24 = 42 \)
- Hash: \( 42 \bmod 13 = 3 \)

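As a quick cross-check, the sketch above reproduces the manual arithmetic:

```python
for name in ["Franz", "Antje", "Alex"]:
    print(name, h(name))
# Franz 0
# Antje 11
# Alex 3
```
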
### Pseudonymized Dataset with Hash Values

We can now add the calculated hash values to the dataset. Note that a salt should still be added to these hash values for increased security in a real implementation:

| Original Name | Height | Shoe Size | Hash Value (no salt) |
|---------------|--------|-----------|----------------------|
| Franz         | 165    | 40        | 0                    |
| Antje         | 170    | 39        | 11                   |
| Alex          | 174    | 42        | 3                    |

To finalize the pseudonymization process, a random salt should be combined with these hash values, ensuring that even if two identical names are processed, they won't result in the same pseudonymized value.

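One possible salted variant, reusing `alph` from the sketch above. The exercise does not specify how the salt is combined, so adding it to the sum before the modulo, and drawing it with `secrets`, are assumptions:

```python
import secrets

def h_salted(s: str, salt: int) -> int:
    # Assumption: the salt is added to the character sum before reducing mod 13.
    return (sum(alph(c) for c in s) + salt) % 13

salt = secrets.randbelow(13)  # fresh random salt
print(h_salted("Franz", salt))
```
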
Let's go through the steps for the next exercise.

### Given Data
The weights of the luggage (in kg) are:
```
22, 44, 11, 19, 21, 17, 17, 11, 11, 19, 22, 17
```

### Categories
- **Light**: less than 15 kg
- **Normal**: between 15 and 20 kg
- **Overweight**: more than 20 kg

### Part (a): Absolute and Relative Frequencies

#### Step 1: Categorize the weights
We will count how many weights fall into each category.

| Category   | Weight Values      | Absolute Frequency |
|------------|--------------------|--------------------|
| Light      | 11, 11, 11         | 3                  |
| Normal     | 17, 17, 17, 19, 19 | 5                  |
| Overweight | 21, 22, 22, 44     | 4                  |

#### Step 2: Calculate Relative Frequencies
Relative frequency is calculated as:
$$
\text{Relative Frequency} = \frac{\text{Absolute Frequency}}{\text{Total Number of Weights}}
$$

Total number of weights = 12

| Category   | Absolute Frequency | Relative Frequency                     |
|------------|--------------------|----------------------------------------|
| Light      | 3                  | \( \frac{3}{12} = 0.25 \) or 25%       |
| Normal     | 5                  | \( \frac{5}{12} \approx 0.42 \) or 42% |
| Overweight | 4                  | \( \frac{4}{12} \approx 0.33 \) or 33% |

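A short sketch that reproduces both tables. The category boundaries follow the definitions above; since weights of exactly 15 or 20 kg do not occur in the data, the boundary handling in `category` is an assumption:

```python
from collections import Counter

weights = [22, 44, 11, 19, 21, 17, 17, 11, 11, 19, 22, 17]

def category(w: float) -> str:
    if w < 15:
        return "Light"
    elif w <= 20:
        return "Normal"
    return "Overweight"

abs_freq = Counter(category(w) for w in weights)
for cat in ["Light", "Normal", "Overweight"]:
    n = abs_freq[cat]
    print(f"{cat}: {n} ({n / len(weights):.0%})")
# Light: 3 (25%)
# Normal: 5 (42%)
# Overweight: 4 (33%)
```
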
### Part (b): Empirical Distribution Function and Question

The empirical distribution function (EDF) gives, for each value \( x \), the proportion of weights less than or equal to \( x \).

Let's arrange the weights in increasing order:
```
11, 11, 11, 17, 17, 17, 19, 19, 21, 22, 22, 44
```

The EDF takes the following values at the observed weights:
- Less than or equal to 11 kg: 3/12 = 0.25
- Less than or equal to 17 kg: 6/12 = 0.5
- Less than or equal to 19 kg: 8/12 ≈ 0.67
- Less than or equal to 21 kg: 9/12 = 0.75
- Less than or equal to 22 kg: 11/12 ≈ 0.92
- Less than or equal to 44 kg: 12/12 = 1

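A minimal sketch of the EDF as a function:

```python
weights = [22, 44, 11, 19, 21, 17, 17, 11, 11, 19, 22, 17]

def edf(x: float) -> float:
    """Empirical distribution function: fraction of weights <= x."""
    return sum(w <= x for w in weights) / len(weights)

for x in sorted(set(weights)):
    print(f"F({x}) = {edf(x):.2f}")
# F(11) = 0.25, F(17) = 0.50, F(19) = 0.67,
# F(21) = 0.75, F(22) = 0.92, F(44) = 1.00
```
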
#### Question: What is the proportion of weights that are less than 18 kg or more than 23 kg?
- **Weights less than 18 kg:** 6 out of 12 = \( \frac{6}{12} = 0.5 \) or 50%
- **Weights more than 23 kg:** 1 out of 12 = \( \frac{1}{12} \approx 0.08 \) or 8%

Since the two events are disjoint, their proportions simply add up:
$$
\frac{6}{12} + \frac{1}{12} = \frac{7}{12} \approx 0.58 \text{, or about } 58\%
$$

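This can be verified directly against the raw data:

```python
weights = [22, 44, 11, 19, 21, 17, 17, 11, 11, 19, 22, 17]

p = sum(w < 18 or w > 23 for w in weights) / len(weights)
print(f"{p:.2f}")  # 0.58 (exactly 7/12)
```
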
### Visualization: Bar Chart and Histogram

Now, let's create a bar chart and a histogram for the weight categories.

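A sketch using matplotlib; the histogram's bin edges are chosen to match the category boundaries, and the exact styling is illustrative:

```python
import matplotlib.pyplot as plt

weights = [22, 44, 11, 19, 21, 17, 17, 11, 11, 19, 22, 17]
categories = ["Light", "Normal", "Overweight"]
abs_freq = [3, 5, 4]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: absolute frequency per category
ax1.bar(categories, abs_freq)
ax1.set_title("Absolute Frequencies per Category")
ax1.set_ylabel("Frequency")

# Histogram: bins follow the category boundaries (<15, 15-20, >20)
ax2.hist(weights, bins=[0, 15, 20, 45], edgecolor="black")
ax2.set_title("Histogram of Luggage Weights")
ax2.set_xlabel("Weight (kg)")
ax2.set_ylabel("Frequency")

plt.tight_layout()
plt.show()
```
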
The bar chart displays the absolute frequencies of each weight category, while the histogram shows the distribution of luggage weights according to the defined categories (light, normal, and overweight).

These visualizations help in understanding how the weights are distributed across the categories.