#112369 After setting "Keep non-breaking spaces" and "Pad Empty Tags" to "no", I still get nbsp

Posted in ‘Editor’

Latest post by twhite on Thursday, 24 November 2022 15:23 GMT

twhite
After setting "Keep non-breaking spaces" and "Pad Empty Tags" to "no", I still get nbsp
in Joomla 3.10 or 4.25, when using JCE 29.32 Pro and Firefox v107.

1.) I cut and paste a paragraph whose text has nbsp; in it, such as the following:

<strong>Recuerdos de Śrīpāda Govardhana Bābājī Mahārāja</strong>

2.) I verify that JCE has changed the nbsp; to a regular space character.
3.) I save the article, check the contents of the database, and see no nbsp; in the article.
4.) While viewing the article from the front end, the spacing is screwed up.
5.) When examining it with the Firefox developer tools, I see some nbsp; but not all spaces are nbsp; as in the following example:

<strong>Recuerdos de nbsp;Śrīpāda nbsp;Govardhana nbsp;Bābājī nbsp;Mahārāja</strong> [Śrīpāda nbsp;Govardhana nbsp;Bābājī nbsp;Mahārāja, anteriormente llamado nbsp;Śrī nbsp;Nṛsiṁhānanda nbsp;Brahmacārī, es un discípulo de nbsp;Śrīla nbsp;Bhaktivedānta nbsp;Vāmana Gosvāmī nbsp;Mahārāja nbsp;y fue su sirviente personal durante muchos años].

In the above text I took out the & so the nbsp; would not disappear. I realize this is not a JCE problem, but it is driving me nuts. Grateful for any explanation of the issue.

Ryan
When there are two or more consecutive spaces in the text, all the spaces after the first will always be nbsp; regardless of the JCE setting. This is correct HTML syntax.

Ryan Demmer

Lead Developer / CEO / CTO

Just because you're not paranoid doesn't mean everybody isn't out to get you.

twhite
Thanks for your response. There are not two spaces; the space you see next to the nbsp; was where the & went, as in &nbsp;. I took it out for posting purposes because if I had left it in, you would not be able to see where the &nbsp; was, as it would turn into a space. I could have replaced the semicolons with colons as follows for the same effect:

<strong>Recuerdos de&nbsp:Śrīpāda&nbsp:Govardhana&nbsp:Bābājī&nbsp:Mahārāja</strong> [Śrīpāda&nbsp:Govardhana&nbsp:Bābājī&nbsp:Mahārāja, anteriormente llamado&nbsp:Śrī&nbsp:Nṛsiṁhānanda&nbsp:Brahmacārī, es un discípulo de&nbsp:Śrīla&nbsp:Bhaktivedānta&nbsp:Vāmana&nbsp:Gosvāmī&nbsp:Mahārāja&nbsp:y fue su sirviente personal durante muchos años].

So still baffled...

Ryan
1.) I cut and paste a paragraph whose text has nbsp; in it, such as the following:

Recuerdos de Śrīpāda Govardhana Bābājī Mahārāja


Where is the content originally from? It seems like some of the single spaces are UTF-8 non-breaking spaces (which is unusual), which are converted into nbsp; when the content is processed.
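
One way to check whether a piece of pasted text contains literal non-breaking space characters (U+00A0) rather than ordinary spaces is sketched below in Python. This is only an illustration, not part of JCE; the function name `find_nbsp_positions` is hypothetical.

```python
# Minimal sketch: locate literal non-breaking spaces in a string.
# NBSP is the Unicode character U+00A0 (NO-BREAK SPACE).
NBSP = "\u00a0"


def find_nbsp_positions(text: str) -> list[int]:
    """Return the index of every literal non-breaking space in text."""
    return [i for i, ch in enumerate(text) if ch == NBSP]


if __name__ == "__main__":
    # The \u00a0 escapes stand in for non-breaking spaces pasted from a document.
    sample = "Recuerdos de\u00a0Govardhana\u00a0Mahārāja y fue"
    positions = find_nbsp_positions(sample)
    print(f"{len(positions)} non-breaking space(s) at indexes {positions}")
```

An empty list means every space in the text is a regular space (U+0020).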


twhite
My revised analysis below:

1.) Where is the content originally from?
One person sends an MS Word .doc file. Why she can't send .docx, I don't know.
I open the file on macOS with the latest Pages. If I cut and paste from Pages into JCE, I lose all the italics.
So I export the file from Pages to an MS Word .docx file. I then open it in a virtual machine running Windows 10 with Word 2007.
Then I save it as "Single File Web Page" and it gets the ".mht" file extension. I open this file with the Opera browser, latest version on Mac.

2. How are the spaces saved in the MHT file?
By doing a "hexdump -C filename" on this file, I see two bytes, "0d 0a", in the place where the non-breaking space goes. What these are I don't know. Other spaces are just encoded as regular spaces, as the single byte hex "20". (sample file attached)

3.) What does JCE do with these 0d 0a bytes?
In the JCE Global Configuration, if I have "Keep non-breaking spaces" set to Yes, when I cut and paste text from this web page, JCE converts these "0d0a" bytes to the six characters "&nbsp:" (the last one being a semicolon rather than a colon).
When I save the article, what ends up in the database are these six characters, which can be seen if I dump the article from the command line with
echo "select introtext from #_content where id = 48;" | mysql database-name | hexdump -C
BTW, JCE does a great job with MS Word cleanup.

4.) What does JCE do with the 0d 0a bytes when "Keep non-breaking spaces" is set to No?
When I cut and paste text from this web page, JCE converts these "0d0a" bytes to "c2a0", which is the UTF-8 hex form of the character that "&nbsp:" represents.
While in JCE, when I toggle off the editor to see the raw HTML, I see only a space, and not the "&nbsp:" as before.
However, when I save the article and dump the database, I see the non-breaking space is still there, just in another form, as "c2a0".
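
The byte values mentioned in points 2 to 4 can be reproduced with a few lines of Python. This is a sketch for illustration only; the helper name `utf8_hex` is hypothetical. It shows that the non-breaking space character U+00A0 is stored in UTF-8 as `c2 a0`, that a regular space is `20`, and that `0d 0a` is a carriage return plus line feed (a line ending), so the bytes seen in the MHT dump may simply be a line break in the file rather than the non-breaking space itself.

```python
# Minimal sketch: show how the characters discussed above look as UTF-8 bytes.
def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return text.encode("utf-8").hex(" ")


if __name__ == "__main__":
    print(utf8_hex("\u00a0"))  # non-breaking space character
    print(utf8_hex(" "))       # regular space
    print(utf8_hex("\r\n"))    # carriage return + line feed
    print(utf8_hex("&nbsp;"))  # the six-character HTML entity
```

The HTML entity and the raw `c2 a0` character are two spellings of the same non-breaking space, which would explain why the space survives in the database in either setting.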

5.) Is there any way to convert the non-breaking spaces to regular spaces?
I can do this manually by leaving "Keep non-breaking spaces" set to Yes, and then toggling the editor and replacing them, but this is really time consuming.
I could also just paste the HTML from JCE into vim and do a search and replace, then paste it back. But not all users have that ability.
I wonder if it would be possible for JCE to have a button to convert all "c2a0" to "20" and one to convert all "&nbsp:" to a regular space?
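
Outside the editor, the search and replace described in point 5 could be scripted. The following is a minimal Python sketch, not a JCE feature; the function name `nbsp_to_space` is hypothetical, and it assumes every non-breaking space in the HTML is unwanted.

```python
import re

# Matches the named entity, the decimal and hex numeric entities,
# and the literal U+00A0 character (UTF-8 bytes c2 a0).
_NBSP_PATTERN = re.compile("&nbsp;|&#160;|&#[xX][aA]0;|\u00a0")


def nbsp_to_space(html: str) -> str:
    """Replace every form of non-breaking space with a regular space."""
    return _NBSP_PATTERN.sub(" ", html)


if __name__ == "__main__":
    dirty = "<strong>Recuerdos de&nbsp;Govardhana\u00a0Mahārāja</strong>"
    print(nbsp_to_space(dirty))
```

Note that this also removes non-breaking spaces that were inserted on purpose, so it is best applied only to pasted content that is known to have the problem.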



Attachments

sample.zip

Ryan
I think this could probably be solved in step 1.

Can you attach a zip of the original .doc file?


twhite
Thanks, here is the original doc file attached.

Attachments

sample_doc.zip

Ryan
I'm not seeing a lot of issues when pasting content from Pages. The italics are still present, and there don't seem to be nbsp problems.


twhite
I don't understand why the italics disappear when I cut and paste from Pages, but stay when I do it from the web page.

It must be some setting in my JCE profile. Can you send me a copy of your JCE profile to look at?

I attached a picture of my latest cut and paste from Pages. The words pūjyapāda and hari-kathā should be italicized.


Ryan
Please post a screenshot of the parameters page at Editor Profiles -> Plugin Parameters -> Clipboard.


twhite
Screen shot attached. I am starting to feel stupid :-) I am sure it is something very simple.


Ryan
Try these settings:

Remove All Spans : No
Remove All Styles : Yes
Styles to keep: [blank]
Styles to remove: [blank]
Remove Tags: [blank]
Keep Tags: [blank]
Remove Attributes: class
Allow Event Attributes: No
Remove empty paragraphs: Yes

Pages appears to use <i> for italics and <b> for bold text, but you can search/replace these (to <em> and <strong>) in the Code tab if you want to.


twhite
Thanks, I tried the settings above and checked several times that they were set just as you listed.
I logged out and back in several times.
I still do not get the desired results. The italics and bold do not transfer over on paste.
I am using Pages version 12.1 and Firefox 107.

I have attached 3 new images, of the settings, bigger image size this time.


Ryan
Please send me a login - https://www.joomlacontenteditor.net/contact/site-login


twhite
thanks, sent

Ryan
This appears to be some sort of bug in Firefox, which doesn't extract or transfer the HTML data from the clipboard when pasting from Pages. It works fine in Chrome, Brave, Safari etc.


twhite
You are right, I tried Safari and Opera and they work fine. There is also no issue with the non-breaking spaces; they are all changed to regular spaces. I will be more careful about using Firefox for pasting.

Thanks for discovering this. I have been baffled by this for some time.