Capitalization and Hyphenation in Python

I am teaching myself Python through an online class, and there is one question that I cannot figure out:

I want the user to enter a string. For example:

The Input string = “born on a monday in the autumn of 1997, i will celebrate my twenty Fifth birthday in the coming September by eating two third of a cheese cake.”
The Output String = “Born on a Monday in the Autumn of 1997, I will celebrate my Twenty-fifth birthday in the coming September by eating two-third of a cheese cake.”

It requires:

  1. Capitalise (i) days of the week; (ii) months; and (iii) seasons;
  2. Hyphenate ordinal numbers (e.g. Forty-first, Ninety-second, etc.) and fractions (e.g. half-, quarter-, one-third, three-fourteenth, …, up to N-twentieth);
  3. Capitalize the first non-space character in the string and the first character after a period, exclamation mark or question mark.

So far I only have:

import re

def uppercase(matchobj):
    # Return the matched text in upper case
    return matchobj.group(0).upper()

# Capitalize the first non-space character in the string
# Capitalize the first non-space character after a period, exclamation mark or question mark
# Capitalize a lowercase "i" if it is preceded by a space and followed by a space, full stop, exclamation mark, question mark or apostrophe.
def capitalize(s):
    return re.sub(r'^\s*[a-z]|[.?!]\s*[a-z]|\s+i(?![a-z])', uppercase, s)

def main():
    s = input("Enter string: ")  # reads a string from the user
    capitalized = capitalize(s)  # capitalizes the user's string using the function
    print("Capitalization of letter strings:", capitalized)  # displays the result

# Call the main function
main()

First off, Python is way less complicated than that.

As an example:

user_input = input("Enter string: ")
cap_string = user_input.capitalize()
print("The capitalized string is:", cap_string)
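One thing worth knowing about `str.capitalize()`: besides upper-casing the first character, it also lower-cases everything else, so it can undo capitals that were already correct. A quick sketch (the sample sentence is made up for illustration):

```python
# str.capitalize() upper-cases the first character and lower-cases the rest
sample = "born in September. i like Python"
capitalized = sample.capitalize()
print(capitalized)  # -> Born in september. i like python
```

So it covers the very first letter of the string, but the rest of the requirements still need separate handling.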

Are you coming from a background in C?

Also, wrap your code using Markdown:

```python
# some code
```

It’s also unclear what method is being taught/learned with this exercise, as there’s more than one way to accomplish the task.

Hi @rob42, thank you for your reply. I don’t have any prior programming background. It is a beginners’ course for Python, and the syntax was designed by my friend. My teacher suggests that I solve this question in a simple way. I am trying to use an “if statement”, but my teacher does not recommend this because it will become complicated…

Ah, I see.

Given that it’s a course aimed at beginners with zero prior programming background, I’d be very surprised if you’re expected to come up with a solution that’s based on regex; more likely you’ll be learning the basic data types, such as ‘lists’, ‘dictionaries’, ‘sets’ and ‘tuples’; correct me if I’m wrong.

The underlying question being: where are you at and what principle are you learning with this exercise?

Yes, it is! I am learning these basic notions, and I have been learning for 3 months so far. I think the solution to this exercise will involve dictionaries, lists and if statements, but I am not sure about it. I have modified my syntax, and I think it could work:

sentence = input("input your sentences: ").split()
for word in range(len(sentence)):
    if sentence[word] == "i":
        sentence[word] = "I"
    for character in sentence[word]:
        print(word, character)

Somehow, I think my syntax should involve the following lists, but I am not sure how to apply them:

ordinal_numbers = [
    "first",
    "second",
    "third",
    "fourth",
    "fifth",
    "sixth",
    # ...
    "tenth",
    "twentieth",
    # ...
    "hundredth",
    "thousandth",
    # ...
]

week_days = [
    "sunday",
    "monday",
    # ...
]

months = [
    "january"
    # ...
]

Okay, a nice approach on which to build, but one step at a time: we’ll get to that.

You know that you can do on-the-fly conversions, right? That is to say, that any text string can be converted to, say, a ‘list’, like this:

>>> string = "I am a string"
>>> print(list(string))

… and that Lists are …

  • ordered
    • the items have a defined order, and that order will not change.
  • indexed
    • the first item has index [0], the second item has index [1] etc.
  • changeable
    • meaning that we can change, add, and remove items in a list after it has been created.
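A small sketch of those three properties in action (the variable names here are made up for illustration):

```python
letters = list("abc")             # ordered: ['a', 'b', 'c']
first = letters[0]                # indexed: the first item is 'a'
letters[0] = letters[0].upper()   # changeable: replace an item in place
letters.append("d")               # changeable: add an item at the end
joined = "".join(letters)         # turn the list back into a string
print(first, letters, joined)     # -> a ['A', 'b', 'c', 'd'] Abcd
```

Strings themselves cannot be changed in place, which is why converting to a list (or building a new string) is useful for this kind of exercise.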

Have a look at this script and you should see how to get your project off to a good start.

I’m not going to do it all for you, as you’ll not learn so much that way, but you will learn by trying things, breaking things, asking questions and then fixing things.

#!/usr/bin/python3

# Get the user input
user_input = input("Enter string: ")

# Strip off any leading and/or trailing spaces, capitalize the first character
# and store the result in strip_and_cap_input
strip_and_cap_input = user_input.strip().capitalize()

# Display the results
print('\n'+"The user input can be a 'list', like this:"+'\n',list(user_input),'\n')

print("The stripped and capitalized string is:", strip_and_cap_input+'\n')
print("The 'list' of that is:",list(strip_and_cap_input),'\n')

print("---------------------------"+'\n')

if chr(32) in list(strip_and_cap_input):
    print("SP found in input")

if '.' in list(strip_and_cap_input):
    print(".  found in input")

if '!' in list(strip_and_cap_input):
    print("!  found in input")

if '?' in list(strip_and_cap_input):
    print("?  found in input")

The other skill you’ll find of huge benefit is to learn about version control, using git, but that’s off topic for both this thread and this forum.

For the capitalisation of days, months and seasons, I would build a dictionary for each, then substitute any non-capitalised words with the capitalised ones, but you may want to use a different approach.

Example:

days = {
    "monday"    : "Monday",
    "tuesday"   : "Tuesday",
    "wednesday" : "Wednesday",
    "thursday"  : "Thursday",
    "friday"    : "Friday",
    "saturday"  : "Saturday",
    "sunday"    : "Sunday"
    }
print(days["monday"])
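A minimal sketch of how such a dictionary could drive the substitution, assuming the sentence is split into words on spaces (the shortened dictionary and the sample sentence are only for illustration):

```python
days = {"monday": "Monday", "sunday": "Sunday"}  # shortened for the example

sentence = "see you on monday or sunday"
words = sentence.split()

fixed_words = []
for word in words:
    # dict.get(key, default) returns the capitalized form for a known day,
    # and the word itself when it is not a key in the dictionary
    fixed_words.append(days.get(word, word))

result = " ".join(fixed_words)
print(result)  # -> see you on Monday or Sunday
```

Note that a word with punctuation attached, such as "monday,", would not match a key in this simple form.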

So, the way forward would be to do one thing at a time, and perhaps build an output string as you go, then use that output string for the next stage, and so on, until you’ve covered all bases and are ready to simply display the result.

It’s possible to do more than one operation at a time, but the code may not be so easy to read or debug when things don’t work as expected.

Note: the way that I style my code may very well be at odds with ‘convention’, but as I work solo, I concern myself more with how things fall on the eye than with convention. I also have a disability that is less of an issue for me to deal with if I format things the way that I do.

Hi, Rob. I really appreciate your sharing; it is very inspiring! I understand the principles of the syntax, but what if I want to change every single “i” to “I”, and capitalise the character after “.”, “!” and “?”? I attempted to use “.uppercase”, but it did not work.

#I have adjusted in the original syntax
if 'i' in list(strip_and_cap_input):
    strip_and_cap_input = "I"
# what if I want to capitalise the first letter after full-stop..
if '.' in list(strip_and_cap_input):
    strip_and_cap_input.upper

Your code looks very good. I would like to learn the ‘conventional’ way, and I really appreciate your help. Could you please also offer me some insights? I have built another dictionary for hyphenation, but I have no idea how to combine the dictionary with the syntax.
Like:

days = {
    "monday"    : "Monday",
    "tuesday"   : "Tuesday",
    "wednesday" : "Wednesday",
    "thursday"  : "Thursday",
    "friday"    : "Friday",
    "saturday"  : "Saturday",
    "sunday"    : "Sunday"
    }

Ordinal_Numbers = {
    "twenty First"  :   "Twenty-First",
    "twenty Second" :   "Twenty-Second",
    "twenty Third"  :   "Twenty-Third",
    "twenty Fourth" :   "Twenty-Fourth",
    "twenty Fifth"  :   "Twenty-Fifth",
    "twenty Sixth"  :   "Twenty-Sixth",
    "twenty Seventh":   "Twenty-Seventh",
    "twenty Eighth" :   "Twenty-Eighth",
    "twenty Ninth"  :   "Twenty-Ninth",
}

You are very welcome.

I have been away from coding for longer than I’d have liked because ‘life’ had other plans for me, but I’m pleased to be back to it. I am a little rusty and as such, this is as much of a help to me, as it is to you.

Keep those dicts as you’ll most likely need them as you move this forward, but for now, let’s deal with the basics:

How about something like this, which deals with the first issue: capitalizing the first non-space character following a period.

# Get the user input
outstring = ""

user_input = input("Enter string: ")

# check for some sanity #
char = ord(user_input[0]) 
if char < 65:
    print("Error in input")
    quit()
elif char > 90:
    if char < 97:
        print("Error in input")
        quit()        
if char > 122:
    print("Error in input")
    quit()
#--------------------------------#

# Strip off any leading and/or trailing spaces and get the remaining length of the input
strip_input = user_input.strip()
input_len = len(strip_input)

#Capitalize the first non-space character in the string
cap_input = strip_input.capitalize()
outstring = cap_input[0]
#Capitalize the first non-space character after any period, exclamation mark or question mark
x = 1
p = 0
while x < input_len:
    char = cap_input[x]
    if p == 1:
        if char != ' ':
            char = char.upper()
            p = 0
    x += 1
    if char == '.':
        p = 1
    outstring += char

print(outstring)

I’ve taken the liberty of doing a sanity check on the input, as you’ll see. That kind of thing should always be the first thing you do whenever you are getting input from an un-trusted source. In fact, it should be made into a custom function and imported into any future apps that you develop. As time goes on, you should build your own collection of useful functions, so that you don’t have to ‘reinvent a wheel, in order to build a car’, so to speak.
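As a rough sketch of what such a reusable function might look like: the name `starts_with_ascii_letter` is made up for this example, and unlike the code above it strips the input first, so that empty or all-space input is rejected rather than raising an error.

```python
def starts_with_ascii_letter(text):
    """Return True if the first non-space character is A-Z or a-z."""
    stripped = text.strip()
    if not stripped:              # empty or all-space input has no first character
        return False
    code = ord(stripped[0])
    # 65-90 are the ASCII codes for A-Z, 97-122 for a-z
    return 65 <= code <= 90 or 97 <= code <= 122

print(starts_with_ascii_letter("born on a monday"))   # True
print(starts_with_ascii_letter("1997 was the year"))  # False
print(starts_with_ascii_letter(""))                   # False
```

The calling code can then decide what to do when the check fails, for example print an error and quit.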

That code, you should be able to modify so that it deals with a ‘!’ and so on, in the same way. I don’t know if you’ve got as far as creating custom functions yet, if so, excellent; you should be able to adapt the code into such. If not, no matter; simply work top down.

I’ll be AFK for a few hours, but I’m sure you’ll study that code and even improve upon it.

This site (Python Tutorial) is one of my ‘go to’ places for Python.

Thanks Rob! I modified it by using an “if statement”, and it meets requirement one now!!

x = 1
p = 0
while x < input_len:
    char = cap_input[x]
    if p == 1:
        if char != ' ':
            char = char.upper()
            p = 0
    x += 1
    if char == '.' or char =='!' or char == '?':
        p = 1
    outstring += char

print(outstring)

Now for the other two requirements:

  1. Capitalize (i) days of the week; (ii) months; and (iii) seasons;
  2. Hyphenate ordinal numbers (e.g. Forty-first, Ninety-second, etc.) and fractions (e.g. half-, quarter-, one-third, three-fourteenth).

I designed this syntax using the earlier “dictionary” idea. It works, but could I combine these two pieces of syntax into one?

sentence = input("input your sentences: ").split()

#dictionary
mydict = {
    "i" : "I",
    "monday"    : "Monday",
    "tuesday"   : "Tuesday",
    "twenty First"  :   "Twenty-First",
    "twenty Second" :   "Twenty-Second",
    "twenty Third"  :   "Twenty-Third",
    "twenty Fourth" :   "Twenty-Fourth",
    "january"   :   "January",
    "february"  :   "February",
    "march"     :   "March",
    "april"     :   "April",
}

for word in range(len(sentence)):
    if sentence[word] in mydict:
        sentence[word] = mydict[sentence[word]]

print (sentence)

Nice one!

Don’t forget you’ll need to deal with an apostrophe, so that “i’ve” becomes “I’ve”
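One possible way to handle that on a per-word basis, as a sketch only: the helper name `capitalize_pronoun` is made up, and it assumes the words have already been split on spaces.

```python
def capitalize_pronoun(word):
    # Accept both the straight (') and the curly (’) apostrophe
    if word == "i" or word.startswith(("i'", "i’")):
        return "I" + word[1:]
    return word

words = "i think i've seen it".split()
fixed = []
for word in words:
    fixed.append(capitalize_pronoun(word))

print(" ".join(fixed))  # -> I think I've seen it
```

A word such as "it" is left alone because it neither equals "i" nor starts with "i" plus an apostrophe. An "i" with punctuation attached (for example "i.") is not covered by this sketch.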

A note regarding terminology: syntax is the structure of the code, or ‘code block’; e.g. if you were to omit the comma that separates the key-value pairs of a dictionary, that would be a ‘syntax error’.

As for how you construct your code (some call them ‘scripts’, in Python), it’s a personal choice: just like any creative process, coding is an ‘art’, but use PEP 20 – The Zen of Python as a guiding light.

I think that you’re well on your way to completing this project and I’d be interested to learn how you get on and what feedback you get from your tutor.

If you get stuck, then do your best to come up with a solution, but if you hit a wall, then just ask. There are many here that are only too happy to help.

Peace.

I’m not sure how far along you are or when your project needs to be finalized, but I’ve added a method for the day name replacement.

#!/usr/bin/python3

# initialization #
outstring = ""
word = ""
x = 1
p = 0

days = {
    "monday"    : "Monday",
    "tuesday"   : "Tuesday",
    "wednesday" : "Wednesday",
    "thursday"  : "Thursday",
    "friday"    : "Friday",
    "saturday"  : "Saturday",
    "sunday"    : "Sunday"
    }

days_key_list   = list(days.keys())
days_value_list = list(days.values())
#----------------#


# Get the user input #
user_input = input("Enter string: ")

# check for some sanity #
char = ord(user_input[0])
if char < 65:
    print("Error in input")
    quit()
elif char > 90:
    if char < 97:
        print("Error in input")
        quit()
if char > 122:
    print("Error in input")
    quit()
#--------------------------------#

# Strip off any leading and/or trailing spaces and get the remaining length of the input #
strip_input = user_input.strip()
input_len = len(strip_input)

# Capitalize the first non-space character in the string #
cap_input = strip_input.capitalize()
word = cap_input[0]

# Step through the user input one character at a time and do the following...
# Capitalize the first non-space character after any period, exclamation mark or question mark #
# Capitalize i if it's followed by an apostrophe or a space #
while x < input_len:
    char = cap_input[x]
    if p == 1:
        if char != ' ':
            char = char.upper()
            p = 0
    x += 1
    if char == '.' or char =='!' or char == '?':
        p = 1
    # if 'i' stands on its own (preceded by a space) and is followed by
    # ASCII code 39 (apostrophe) or a space, then capitalize the 'i';
    # the x < input_len check avoids reading past the end of the input
    if char == 'i' and x < input_len and cap_input[x - 2] == ' ':
        if cap_input[x] == chr(39) or cap_input[x] == ' ':
            char = char.upper()

    # add the character to the current word (a space is kept as its last character)
    word += char

    # at the end of a word (a space, or the end of the input)...
    if char == ' ' or x == input_len:
        # ...substitute the day name
        for day in range((len(days))):
            if days_key_list[day] in word:
                word = word.replace(days_key_list[day], days_value_list[day])
        # ...then add the word to the output and clear 'word'
        outstring += word
        word = ""

print('\n'+outstring)

You’ll be able to use that example to do the other word replacements that are required.
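For the hyphenation part, a two-word match needs a look at the next word as well. Here is a rough sketch of one way that could work; the function name `hyphenate_ordinals` and the two word sets are made up for this example, and it only covers ordinals such as "twenty fifth", not fractions.

```python
tens = {"twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"}
unit_ordinals = {"first", "second", "third", "fourth", "fifth",
                 "sixth", "seventh", "eighth", "ninth"}

def hyphenate_ordinals(sentence):
    words = sentence.split()
    result = []
    i = 0
    while i < len(words):
        word = words[i]
        # Look ahead one word: "twenty" + "Fifth" -> "twenty-fifth"
        if (i + 1 < len(words)
                and word.lower() in tens
                and words[i + 1].lower() in unit_ordinals):
            result.append(word + "-" + words[i + 1].lower())
            i += 2   # skip the word that was just merged
        else:
            result.append(word)
            i += 1
    return " ".join(result)

print(hyphenate_ordinals("my twenty Fifth birthday"))  # -> my twenty-fifth birthday
```

Fractions such as "two third" could follow the same pattern with a set of numerator words, and words with punctuation attached (for example "fifth,") would need the punctuation split off first.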

As I said from the outset, there’s more than one way to accomplish this; as an example, this could be re-coded so that the input is stored as a ‘list’ of words (rather than the character substitution approach that I’ve taken) and then the rules applied to the words before the output. I’m working on a character substitution cypher, which has had some influence on the way that I approached this.
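To give an idea of what that word-list version might look like, here is a minimal sketch. The function name `fix_sentence` and the shortened dictionary are made up for illustration, and it assumes words are separated by spaces (runs of several spaces are collapsed to one).

```python
days = {"monday": "Monday", "friday": "Friday"}  # shortened for the example

def fix_sentence(text):
    words = text.strip().split()
    fixed = []
    start_of_sentence = True
    for word in words:
        # Split off trailing punctuation so "monday." still matches "monday"
        core = word.rstrip(".,!?")
        tail = word[len(core):]
        core = days.get(core, core)
        if core == "i":
            core = "I"
        if start_of_sentence and core:
            core = core[0].upper() + core[1:]
        fixed.append(core + tail)
        # The next word starts a sentence if this one ended with . ! or ?
        start_of_sentence = tail.endswith((".", "!", "?"))
    return " ".join(fixed)

print(fix_sentence("see you on monday. i will be late! sorry"))
# -> See you on Monday. I will be late! Sorry
```

Working on whole words keeps each rule in one short step, at the cost of not preserving the exact spacing of the input.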

I’ve also added some annotation, which is always a good thing to do.