SPO Assignment Phase 2

Hi all. After some research on phase one. I have tested and edited some part of the strspn code in order to make it run and be optimized. But the result is not very optimistic. It turns out that some of my code is not being optimized. However, I have learned many different ways of optimizing. Because of the limited time, I have to end the process but hopefully, someone will try to continue the process and learn more from it.

Here is the first original part that I am looking at

STRSPN (const char *str, const char *accept)
{
 if (accept[0] == '\0')
 return 0;
 if (__glibc_unlikely (accept[1] == '\0'))
 {
 const char *a = str;
 for (; *str == *accept; str++);
 return str - a;
 }

This is my edited code

I removed the “__glibc_unlikely”. This might helps in the case where the prediction is wrong. then it means the processor pipeline will not need to be flushed and reduces cost several cycles. So as long as the prediction is correct most of the time this will tend to be food for performance.

I also added “__builtin_assume_aligned” to allow the compiler to assume that the returned pointer is at least align bytes aligned


STRSPN (const char *str, const char *accept)
{
if (accept[0] == '\0')
return 0;
if (accept[1] == '\0'))
{
//---const char *a = str;
const char *a = __builtin_assume_aligned(str, 16);
for (; *str == *accept; str++);
return str - a;
}

Here is the second original part I am looking at

</pre><pre>s = (unsigned char*) str; // chars which are only 1 byte (8 bits long)\
A char has a range -127 to 127 while an unsigned char has a range of 0 to 255.
if (!p[s[0]]) return 0;
if (!p[s[1]]) return 1;
if (!p[s[2]]) return 2;
if (!p[s[3]]) return 3;
//(+) replace if case with with switch case
//* * * * * * * * * * Added
switch(s){
case !p[s[0]]:
return 0;
case !p[s[1]]:
return 1;
case !p[s[2]]:
return 2;
case !p[s[3]]:
return 3;

In my last post, I decided to change the if and else to switch case But it slow down the program so I did not include it.

Next part is adding size_t. I replaced the integer type with size_t because the variables will not become the negative number.

//unsigned int c0, c1, c2, c3;
unsigned size_t c0, c1, c2, c3; // (+) Because the variables will not be negative number

For testing, I wrote a tester that will call the function 3
times and compare the time between my edited functions with the library one

 

int main() {
printf("Test 1 STRSPN \n ------------------------ \n\n");
int i;
char str[200000 ];
char set[200000 ];

memset(str, '8', 200000);
memset(set, '8', 200000);

clock_t start = clock();
clock_t diff;
i = STRSPN_GNU(str, set);
diff = clock() - start;
int msec = diff * 1000 / CLOCKS_PER_SEC;
printf("Time taken %d seconds %d milliseconds \n", msec / 1000, msec % 1000);
printf("The initial number has %d digits.\n", i);

clock_t start2 = clock(), diff2;
i = STRSPN_MY(str, set);
diff2 = clock() - start2;

int msec2 = diff2 * 1000 / CLOCKS_PER_SEC;
printf("Time taken %d seconds %d milliseconds \n", msec2 / 1000, msec2 % 1000);
printf("The initial number has %d digits.\n", i);

//===============================
printf("Test 2 STRSPN \n ------------------------ \n\n");
char str1[100000];
char set1[100000];

memset(str1, '2', 100000);
memset(set1, '2', 100000);

clock_t start1 = clock();
clock_t diff1;
i = STRSPN_GNU(str1, set1);
diff1 = clock() - start1;
int msec1 = diff1 * 1000 / CLOCKS_PER_SEC;
printf("Time taken %d seconds %d milliseconds \n", msec1 / 1000, msec1 % 1000);
printf("The initial number has %d digits.\n", i);

clock_t start3 = clock(), diff3;
i = STRSPN_MY(str1, set1);
diff3 = clock() - start3;

int msec3 = diff2 * 1000 / CLOCKS_PER_SEC;
printf("Time taken %d seconds %d milliseconds \n", msec3 / 1000, msec3 % 1000);
printf("The initial number has %d digits.\n", i);
//==============================

printf("Test 3 STRSPN \n ------------------------ \n\n");
char str5[200000];
char set5[200000];

memset(str5, "3", 200000);
memset(set5, "12345678", 200000);

clock_t start5 = clock();
clock_t diff5;
i = STRSPN_GNU(str5, set5);
diff5 = clock() - start5;
int msec5 = diff5 * 1000 / CLOCKS_PER_SEC;
printf("Time taken %d seconds %d milliseconds \n", msec5 / 1000, msec5 % 1000);
printf("The initial number has %d digits.\n", i);

clock_t start6 = clock(), diff6;
i = STRSPN_MY(str5, set5);
diff6 = clock() - start6;

int msec6 = diff6 * 1000 / CLOCKS_PER_SEC;
printf("Time taken %d seconds %d milliseconds \n", msec6 / 1000, msec6 % 1000);
printf("The initial number has %d digits.\n", i);

The result of the program is not very good. My function is the second one and it run slower than the library one. The strange thing is that I got different time for each time and run the program. So I choose the average one as the result.

Capture

Resources:

blog.man7.org

stackoverflow.com

https://en.wikipedia.org/wiki/Inline_expansion