+ 6

C++ vs Python3 (strings)

Imagine that you have a given string "s". Imagine you've been instructed to write a concise program that converts every letter of the string to a lowercase letter. Since Python is known for its concise programming, the instructions given to you could be completed via a single line of code ...

    s = s.lower()

In C++, it's very easy to complete the instructions, but can it be JUST AS concise as Python? Take for example ...

    for (int x = 0; x < s.length(); x++) s[x] = tolower(s[x]);

The example code is simple enough, but wouldn't you think that there would be a much simpler and faster way to do the above? I thought for sure that the following code would work in C++, but was wrong:

    s = tolower(s);

I'm going to take a wild guess and assume that the difference in string manipulation between these two languages has to do with iteration via mutable/immutable strings, or the opposite...? Why wouldn't the following C++ code work either?:

    for (auto x : s) x = tolower(x);

If you can think of a way to complete the example instructions from above using C++, and make it JUST AS concise as Python's example code, feel free to enlighten me on what I'm missing. Else, if it can't be done yet you know why it can't be done, also feel free to enlighten me as to why that is. Thanks!

6th Nov 2017, 8:05 PM
Fox
3 Answers
+ 6
Your loop version does work, just be sure to include the proper header for tolower() (<cctype> in C++, <ctype.h> in C). What can't work is s = tolower(s), because tolower() converts a single character, not a whole string. It can be done on a single line, but unlike Python, C requires precise instructions. It's also likely to be faster than Python, since it's compiled code walking a pointer straight over the characters.

    while (*p) printf("%c", tolower((unsigned char)*p++));

Here's an example:

    #include <stdio.h>
    #include <ctype.h>

    int main()
    {
        /* const: a string literal must not be modified */
        const char *p = "heLLo WoRLD";
        /* the loop stops at the terminating '\0' */
        while (*p)
            printf("%c", tolower((unsigned char)*p++));
        return 0;
    }

EDIT* On the topic of differences between the two languages when dealing with string manipulation ....

In Python (and Java), strings are immutable. A string is essentially a fixed array, each element containing a character, which is stored as its numeric code. Python gives the illusion that strings can be manipulated (adding, removing, etc.) without having to do any extra steps. In the background, it is creating a new string, with several safeguard steps to ensure a user-friendly experience. That is why s = s.lower() rebinds s to a new string rather than changing the old one.

C and C++ are different here: a char array and a std::string can both be modified in place (only string literals are read-only). As the programmer, you are handling the strings directly, all the way down to the basic level of characters and their memory locations. When you create a string in C, you are creating an array of chars (characters) that are elements of the whole array (a word), ending with a '\0' terminator.

As for how they iterate through them, on a personal level I have no idea how Python does it in the background because I'm too busy to read its source code. Most likely the same way as C/C++, but with several more steps for convenience, safety, and whatnot (because Python).
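To tie this back to the last snippet in the question: for (auto x : s) copies each character into x, so the assignment only changes the copy and s is left untouched. Binding by reference (auto&) changes the string itself. Below is a minimal sketch, assuming s is a std::string; the function names lower_in_place and lower_with_transform are made up for illustration and are not part of any library.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Range-for by reference: `auto&` binds to each character in place,
// whereas plain `auto` would lower-case a temporary copy and discard it.
inline void lower_in_place(std::string& s) {
    for (auto& c : s) {
        // Cast first: std::tolower is undefined for negative char values.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
}

// Same idea with std::transform, writing the result back over the input.
inline void lower_with_transform(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
}
```

Either function can be called on a std::string holding "heLLo WoRLD" to turn it into "hello world". Neither is quite as short as s.lower(), but once such a helper exists the call site is a single line.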
6th Nov 2017, 9:20 PM
Sapphire
+ 6
btw, I traced Python's internal implementation. The old way looked horrible (for loops, methodical range testing, new object creation)... But I was surprised to find in the current sources:

Each letter is treated as unsigned char (byte)...
Used as index 0-255 of an array[256]...
Where all bytes are replaced with themselves...
Except uppercase values are remapped:

    lookup[0 - 64] = 0 - 64
    lookup[65] = 0x61 # 65 (A) = 97 (a)
    ...
    lookup[90] = 0x7A # 90 (Z) = 122 (z)
    lookup[91 - 255] = 91 - 255

With no wasted testing, I believe this makes it O(n).

Partial reference, the hex sequence abruptly changes @ line 154 (for upper->lowercase) here: https://hg.python.org/releasing/3.5.1/file/tip/Python/pyctype.c#l154

The toupper() mapping table uses the inverse application.
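The table idea described above can be sketched in C++ as follows. This is only an illustration of the technique under the assumption of ASCII bytes, not a copy of Python's source; make_lower_table and ascii_lower are hypothetical names.

```cpp
#include <array>
#include <string>

// Build a 256-entry table where every byte maps to itself,
// except 'A'..'Z' (65..90), which map to 'a'..'z' (97..122).
inline std::array<unsigned char, 256> make_lower_table() {
    std::array<unsigned char, 256> table{};
    for (int i = 0; i < 256; ++i) {
        table[i] = static_cast<unsigned char>(i);
    }
    for (int i = 'A'; i <= 'Z'; ++i) {
        table[i] = static_cast<unsigned char>(i + ('a' - 'A'));
    }
    return table;
}

// One table lookup per byte, with no range test inside the loop.
inline std::string ascii_lower(const std::string& in) {
    static const std::array<unsigned char, 256> table = make_lower_table();
    std::string out = in;  // work on a copy, leave the input untouched
    for (char& c : out) {
        c = static_cast<char>(table[static_cast<unsigned char>(c)]);
    }
    return out;
}
```

The table is built once and reused on every call, so each character costs a single array read. Bytes outside A-Z, including anything above 127, pass through unchanged.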
7th Nov 2017, 2:41 AM
Kirk Schafer
+ 4
Oh, that stop condition is really clever. Lovely solution.
6th Nov 2017, 9:59 PM
Kirk Schafer