6

I have some text that may or may not contain a country name. For example:

' Nigeria: Hotspot Network LTD Rural Telephony Feasibility Study'

This is how I extract the country name from it. In my first attempt:

import pycountry

def findCountry(stringText):
    # Return the first country whose name occurs anywhere in the text.
    for country in pycountry.countries:
        if country.name.lower() in stringText.lower():
            return country.name
    return None

findCountry("Nigeria: Hotspot Network LTD Rural Telephony Feasibility Study")

Unfortunately, this returns Niger, whereas the correct answer is Nigeria. Note that Niger and Nigeria are two different countries.
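
The false match comes from Python's in operator, which tests for a substring: "nigeria" starts with "niger", so whichever of the two names the loop reaches first is returned. A minimal sketch of that behaviour, with a small hard-coded list standing in for pycountry.countries (an assumption, so that it runs without the library):

```python
# Hypothetical stand-in for the names in pycountry.countries.
COUNTRY_NAMES = ["Niger", "Nigeria", "Norway"]

def find_country_naive(text):
    """Return the first name that occurs anywhere in text as a substring."""
    lowered = text.lower()
    for name in COUNTRY_NAMES:
        if name.lower() in lowered:
            return name
    return None

title = "Nigeria: Hotspot Network LTD Rural Telephony Feasibility Study"
print("niger" in title.lower())   # True, because "nigeria" starts with "niger"
print(find_country_naive(title))  # Niger
```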

In my second attempt:

def findCountry(stringText):
    full_list = []
    for country in pycountry.countries:
        if country.name.lower() in stringText.lower():
            full_list.append(country.name)  # store the name, not the Country object

    if len(full_list) > 0:
        return full_list

    return None

I get ['Niger', 'Nigeria'] as output, but I can't find a way to get only Nigeria as the final result. How can I achieve this?

Note: here I know that Nigeria is the correct answer, but later the code itself will have to choose the final country name when one is present in the text, and the detection needs to be highly accurate.

3
  • stackoverflow.com/questions/48607339/… this is what you are looking for I suppose. Commented May 31, 2021 at 4:59
  • Sort countries by the length of their names, in descending order. Commented May 31, 2021 at 5:01
  • @Tangent I am using the same library but with different steps. As I already mentioned, I need a single correct answer, whereas I get the wrong one. Commented May 31, 2021 at 5:01

4 Answers

8

Always search for longest strings first; this will prevent the kind of error you encountered.

countries = sorted(pycountry.countries, key=lambda x: -len(x.name))

2 Comments

@Aamdan sorry man, I could not understand where and how to use this code. Could you please give a hint?
You are iterating over pycountry.countries, which is not sorted; iterating over these sorted countries instead should give you the correct answer.
3

One regex approach would be to build an alternation containing all target countries to be found. Then, use re.findall on the input text to find any possible matches:

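import re
import pycountry

# Join the country names (not Country objects), escaped so they match literally.
regex = r'\b(?:' + '|'.join(re.escape(c.name) for c in pycountry.countries) + r')\b'

def findCountry(stringText):
    countries = re.findall(regex, stringText, flags=re.IGNORECASE)
    return countries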

5 Comments

It returns an empty list. A small change is required to run the program: inside the join call we should write country.name for country in pycountry.countries, because join requires text instead of Country objects. Even with that change, when I pass my string to findall it returns an empty list instead of Nigeria.
@TalibDaryabi Check the updated answer and try running the regex search in case insensitive mode.
It still returns an empty list. I run the code like this: regex = r'\b(?:' + '|'.join(country.name.lower() for country in pycountry.countries) + ')\b' countries = re.findall(regex, title, flags=re.IGNORECASE)
title is the string containing Nigeria.
Lol I have no reading comprehension apparently :D Sorry...
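
A likely cause of the empty list in the comment above is the final ')\b', which is not a raw string. In a regular Python string literal, \b is the backspace character (\x08) rather than a regex word boundary, so the pattern can only match text that is followed by a literal backspace. A small sketch of the difference, using a hard-coded list in place of pycountry.countries (an assumption, so that it runs standalone):

```python
import re

# Hypothetical stand-in for the names in pycountry.countries.
names = ["Niger", "Nigeria", "Norway"]
alternation = "|".join(re.escape(name) for name in names)

broken = r"\b(?:" + alternation + ")\b"   # ")\b" ends with a backspace character
fixed = r"\b(?:" + alternation + r")\b"   # r")\b" ends with a word boundary

title = "Nigeria: Hotspot Network LTD Rural Telephony Feasibility Study"
print(re.findall(broken, title, flags=re.IGNORECASE))  # []
print(re.findall(fixed, title, flags=re.IGNORECASE))   # ['Nigeria']
```

With the word boundary in place, "Niger" is rejected because it is followed by the letter "i", and the alternation falls through to "Nigeria".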
2

I got the correct answer like this:

def findCountry(stringText):
    countries = sorted([country.name for country in pycountry.countries], key=lambda x: -len(x))
    for country in countries:
        if country.lower() in stringText.lower():
            return country
    return None

This follows @Amandan's solution in this question.
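
One caveat with substring matching, even when the longest names are tried first: a name can still match inside an unrelated word, for example "Oman" inside "woman". A possible refinement, offered as a sketch rather than as part of the approach above, is to require word boundaries around each name. The hard-coded list stands in for pycountry.countries (an assumption), and find_country_whole_word is a hypothetical helper name:

```python
import re

# Hypothetical stand-in for [c.name for c in pycountry.countries].
COUNTRY_NAMES = ["Niger", "Nigeria", "Oman", "Romania"]

# Longest names first; each pattern only matches the name as a whole word.
PATTERNS = [
    (name, re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE))
    for name in sorted(COUNTRY_NAMES, key=len, reverse=True)
]

def find_country_whole_word(text):
    for name, pattern in PATTERNS:
        if pattern.search(text):
            return name
    return None

print(find_country_whole_word("Nigeria: Hotspot Network LTD Rural Telephony Feasibility Study"))  # Nigeria
print(find_country_whole_word("A woman-led cooperative"))  # None
print(find_country_whole_word("Trade routes through Oman"))  # Oman
```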


2

The problem here is that in tests for occurrence as a substring, so "niger" in "nigeria" is true. You could also swap the operands on either side of in, but that would fix the Nigeria case without fixing others. You can use == instead, which removes the partial-match problem because the whole string must equal the country name.

def findCountry(stringText):
    for country in pycountry.countries:
        if country.name.lower() == stringText.lower():
            return country.name
    return None
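
To illustrate how the equality check behaves on both a bare country name and the title from the question, here is a short sketch with a hard-coded list standing in for pycountry.countries (an assumption; find_country_exact is a hypothetical name):

```python
# Hypothetical stand-in for the names in pycountry.countries.
COUNTRY_NAMES = ["Niger", "Nigeria"]

def find_country_exact(text):
    """Return a name only if the whole text equals it, ignoring case."""
    wanted = text.lower()
    for name in COUNTRY_NAMES:
        if name.lower() == wanted:
            return name
    return None

print(find_country_exact("Nigeria"))  # Nigeria
print(find_country_exact("Nigeria: Hotspot Network LTD Rural Telephony Feasibility Study"))  # None
```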

2 Comments

Thank you man, but I need the answer to work for all other cases as well.
You are most welcome @TalibDaryabi. Let me know if it solves your problem?
