Convert text to clickable link

i200yrs · April 19, 2023, 5:54am

Hello All again…good day. I have the dataframe as below:

File_Code       File_Name
P001               tech1.doc
P002               tech2.doc
P003               tech3.doc

The files are located in the shared folder in the server as below:
“//ishare.xl.company.com/sites/NTC_Intranet/Shared Documents/Review”

My plan is save the dateframe into csv file…And in the csv i want to convert the text in cell as below into clickable links:

P001  
P002 
P003

Where when we click P001 it will open the tech1.doc and so on.
Please help me how to code in python…thanks a lot.

abessman · April 19, 2023, 6:07am

Not possible. A csv file is just a text file with comma delimited data. Text files do not support hyperlinks.

Instead, you could create a table in markdown (which does support hyperlinks), using pandas.DataFrame.to_markdown.

i200yrs · April 19, 2023, 6:18am

But i open the csv file using excel…or maybe instead of csv, i will save it as excel…thanks

abessman · April 19, 2023, 6:46am

Excel might work, never used it. Here’s a way to do it with markdown:

import pandas as pd

class DocumentTable:
    def __init__(self, file_names, path):
        d = {
            "File_Code": [
                f"[P{i+1:03}](<{path}/{f}>)" for i, f in enumerate(file_names)
            ],
            "File_Name": file_names,
        }
        self._df = pd.DataFrame(d)

    def to_markdown(self):
        return self._df.to_markdown()

example_data = DocumentTable(
    ["tech1.doc", "tech2.doc", "tech3.doc"],
    "//ishare.xl.company.com/sites/NTC_Intranet/Shared\ Documents/Review",
)
print(example_data.to_markdown())

Output:

	File_Code	File_Name
0	P001	tech1.doc
1	P002	tech2.doc
2	P003	tech3.doc

i200yrs · April 19, 2023, 9:37am

Is there other way? I successfuly did it using the code above bu looks like markdown is not what i want…

ajoino · April 19, 2023, 11:31am

This depends on the tool you use to open the csv file I think. I don’t know how excel would do it, but if I cat a csv file (Possibly Linux only) in Alacritty (my current terminal emulator) I get clickable links. Though it must be said, cating csv files is a pain if they have more a couple of columns

rob42 · April 19, 2023, 11:42am

As said: you’ll need to decide on what you’re going to use to display the links. If you’re going to use a spreadsheet app, such as Excel, and load the links via a .csv file, you’ll have an issue, because of the way that the ‘link’ has to be formatted.

As a spreadsheet cell example:

=HYPERLINK("[https://ishare.xl.company.com/sites/NTC_Intranet/Shared%5C%20Documents/Review/tech1.doc]","tech1.doc")

As you can see, there is a comma in there, which is what you’re gong to fall over with a .csv import. There are import options (switches) that you can set/unset so that the fields are delimited by something other than a comma, but that is (so far as I know) a manual operation.

To add (just a ‘btw comment’): there is a space character in the example link, which maybe should not be there. I can see that it’s there in your post, which is why it’s duplicated in the replies.

i200yrs · April 19, 2023, 3:18pm

I found the work around…I just add another column, i named it “Link”.
Then i populated the column with correspanding file path url.
Then i added the code below:

df['Link'] = df['Link'].map('=HYPERLINK("{}")'.format)

I save the df to xlsx. I can click and open the file already.

However, I have 2 things i want to do, hope someone can help.
1.) The file pat is too long, is there a way using python to display other names for url?
2.) Possible to change the url into blue text (like a clickable text)?

Thanks in advance,

rob42 · April 20, 2023, 12:33am

Not quite sure how you intend to ‘display’ the urls: I thought that you were going to have them a the spreadsheet?

Another option would be to have a html table.

This script will generate the html code and display that, but it could be adapted to write said to a file, which could then be parsed in a web browser:

import pandas as pd
import html

urls = {
    'P001': 'https://example.com/tech1.doc',
    'P002': 'https://example.com/tech2.doc',
    'P003': 'https://example.com/tech3.doc'
}


def make_link(url, link):
    return f'<a href="{url}">{link}</a><br>'

links = []

for item in urls:
    links.append(make_link(urls[item], item))

df = pd.DataFrame([link for link in links], columns=['urls'])
df = df.set_index('urls')
print(html.unescape(df.to_html(render_links=True)))

To make life a little easier for myself, I’ve code this with https://example.com rather than the long urls that you need for the server connection.

Not sure if this will help or not, but it may be of help to someone else, even if it’s not for you.

i200yrs · April 20, 2023, 2:22am

I got error while install html.

C:\>pip install html
Collecting html
  Using cached html-1.16.tar.gz (7.6 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [1 lines of output]
      ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

[notice] A new release of pip available: 22.3.1 -> 23.0.1
[notice] To update, run: python.exe -m pip install --upgrade pip

Any advice please.

i200yrs · April 20, 2023, 4:06am

I got the solution, I used → pip install requests-html
Thanks a lot…

The output is like this:

urls
<a href="https://example.com/tech1.doc">P001</a><br>
<a href="https://example.com/tech2.doc">P002</a><br>
<a href="https://example.com/tech3.doc">P003</a><br>

And it is not clickable…any idea what is wrong…thanks

abessman · April 20, 2023, 5:25am

html is a built-in module. It is already installed. requests-html is an HTLM parser, which is not what you need.

They are clickable if you view them in an HTML-aware application. Try saving the output in a file named ‘documents.html’ and open it with your browser.

i200yrs · April 20, 2023, 6:04am

Have tried to do df.to_html(‘docs.html’) but i got same results.

abessman · April 20, 2023, 6:14am

Try adding escape=False to the arguments passed to df.to_html.

rob42 · April 20, 2023, 8:01am

Yes; that’s my bad: having print(df.to_html(render_links=True, escape=False)) means that the import html can be dropped – one less package import, is always a good thing.

@i200yrs
Another way to save the file:

with open('links.htm', mode='w', encoding='UTF-8') as output:
    print(df.to_html(render_links=True, escape=False), file=output)

I use .htm for files such as this, because .html implies that the page would have all the fields and tags, that a ‘valid html page’ needs, be in place.